Issues in Science and Technology Librarianship Spring 1999
DOI:10.5062/F4PN93MK

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

Electronic Journals as a Component of the Digital Library

Laurie E. Stackpole
Richard James King
Ruth H. Hooker Research Library
Naval Research Laboratory
Washington, DC 230-5334
laurie.stackpole@nrl.navy.mil
james.king@nrl.navy.mil

Abstract

The Naval Research Laboratory (NRL) Library provides its users in four geographic locations with digital library services including over 500 electronic journals. The InfoWeb Information System and Gateway serves as a single point of access to these services. InfoWeb capabilities prominently featured in this article include the following: (1) the TORPEDO Ultra Digital Library Initiative, (2) the Library's web-based Unicorn STILAS catalog, (3) a locally-mounted "Web of Science" Science Citation Index Expanded database, and (4) Contents-to-Go, a custom-developed current awareness and document delivery service. TORPEDO Ultra provides institutional researchers with a unified way to search all the Library's digital collections, whether these materials originate from a commercial publisher such as Elsevier; an association publisher such as the American Physical Society; or a government publisher such as NRL. In addition, TORPEDO Ultra provides an infrastructure that enables databases that have traditionally served as pointers to documents to now deliver that content directly. In a future release, TORPEDO Ultra will enable end-user scientists to search and link to journals residing on publisher web sites at the same time that they are searching locally mounted electronic journals.

Introduction

At the Naval Research Laboratory (NRL), which serves as the Navy's corporate research capability, employees in four geographic locations enjoy 24 x 7 access to databases, information products, reference tools, technical reports, and journals needed in the course of their work. Services are available to researchers working in their offices, at home, or on travel through a web-based information system and gateway called InfoWeb ({http://library.nrl.navy.mil/index.cfm}). InfoWeb was developed by the Ruth H. Hooker Research Library at NRL to meet the information needs of 3,500 Federal staff members and about 1,500 on-site contractors located in Washington, D.C.; Bay St. Louis, Mississippi; and Monterey, California. In addition, InfoWeb services are available to NRL's parent organization, the Office of Naval Research, in Arlington, Virginia.

Digital Library Context

InfoWeb serves as a single point of entry to the NRL "digital library" (Atkinson, Stackpole & Yokley 1995; Atkinson & Stackpole 1995). It not only provides web access to the Library's online catalog, but it also facilitates researcher-library communication through the use of online request forms, e-mail links to staff, and a suggestion box capability.

InfoWeb also provides employees with an organized approach to finding information that is openly available on the web, eliminating the frustration of finding thousands of "hits" in response to a web search. The Library has organized frequently requested information into broad subject categories: Computer Support, Government Information, Internet Directories, and Science Resources. Library staff mine the web for sites that contain information of particular interest to NRL researchers. InfoWeb links are then created and annotations are added so that researchers know in advance what they can expect to find at each site that is available through InfoWeb. Every month an automated program checks links to be sure they are active. In addition, the staff continually adds new sites and re-evaluates those that are on the system to be sure they haven't gone out of date. This information is accessible outside NRL and is heavily used by other libraries and end users in government, academia, and the general public.
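
The monthly link check described above can be illustrated with a minimal sketch. The article does not describe how the NRL program works, so the function names, the use of HEAD requests, and the injectable `opener` parameter are all assumptions made for illustration.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_link(url, opener=urlopen, timeout=10):
    """Return (url, ok, detail) for one link.

    `opener` is a hypothetical hook so the check can be exercised
    without network access; it defaults to urllib's urlopen.
    """
    try:
        # HEAD avoids downloading the page body; some servers reject it,
        # so a production checker might fall back to GET.
        req = Request(url, method="HEAD")
        with opener(req, timeout=timeout) as resp:
            return (url, 200 <= resp.status < 400, str(resp.status))
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return (url, False, str(err.code))
    except (URLError, OSError, ValueError) as err:
        # Unreachable host, timeout, or a malformed URL.
        return (url, False, type(err).__name__)


def find_dead_links(urls, opener=urlopen):
    """Return the check results for every link that failed."""
    results = (check_link(u, opener) for u in urls)
    return [r for r in results if not r[1]]
```

A scheduler such as cron could run `find_dead_links` once a month over the stored list of links and report the failures to staff for review.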

The InfoWeb resources most in demand by the NRL community are, as would be expected, those that the Library licenses or pays for. These include databases that the Library provides by linking to external web sites, and those databases that the Library mounts locally on its own servers. Examples of licensed databases that reside on remote sites are: the Institution of Electrical Engineers (IEE) INSPEC database covering physics, electrical engineering and electronics, computing and control, and information technology; and OCLC FirstSearch, providing access to more than 40 databases in all subject areas. Locally mounted databases include: Science Citation Index Expanded and the NTIS database of government publications.

Digital Library Implications

While online library catalogs and databases are prerequisites for a digital library, they are not sufficient to qualify a library as "digital." Because they most frequently serve as pointers to documents, they do not meet the fundamental goal of the digital library, which is to deliver the full content of library materials to the desktop.

One way in which libraries can create content, and ultimately share it on an enterprise-wide basis, is by digitizing institutional publications that are not protected by copyright. The NRL Library has been scanning its technical reports collection since 1989, and has converted about 180,000 reports (nine million pages) to a digital format. Because most of these reports have limitations on dissemination, they are currently available only within the Library to staff and authorized users. As the Library begins to provide services over a secure network, access to all of this content will become more widely available. At this time, the full content of some 6,000 unrestricted reports is openly available through InfoWeb, either through a search of the Library's web-based catalog or through the TORPEDO Ultra digital repository.

A more typical way of providing content for access by library users is to license it. For example, NRL has a license with Elsevier Science permitting it to mount electronic versions of journals on a local server for institutional access. Elsevier was the first publisher to offer electronic journal subscriptions on a commercial basis, with a 1996 offering that included 1995 back files. The NRL Library was an early implementer of Elsevier Electronic Subscriptions (EES) and currently has over 200 Elsevier journals archived in TORPEDO Ultra and available to the user community through InfoWeb.

Cooperative agreements are another approach to building digital content and can help minimize the risk for both publishers and libraries in testing the digital waters. The Technology Transfer Act of 1986 provides the authority for government organizations and other types of organizations to enter into a Cooperative Research and Development Agreement (CRADA). The NRL Library recently completed a CRADA with the American Physical Society (APS) that resulted in the creation of a 10-year digital archive called Physical Review Online Archive (PROLA). Under the terms of this agreement, much of the PROLA content is also mounted locally at NRL. The NRL Library and the American Institute of Physics (AIP) recently signed a similar CRADA, which will expand the AIP's web-based journal offerings and provide NRL users with access to 17 new locally mounted journals going back to 1992.

In addition to locally-mounted content, the NRL Library provides InfoWeb links to over 280 journals that reside on remote sites; examples include the publications of scholarly associations such as the American Chemical Society, as well as journals produced by commercial publishers such as Springer-Verlag.

Digital Library Repository

TORPEDO Ultra is the key component of the NRL digital library initiative, storing the Library's digital content and providing search and retrieval capabilities for accessing it. TORPEDO Ultra has been developed by the NRL Library to provide end users with the ability to browse and search its rapidly growing digital collection. This collection currently consists of 6,000 technical reports, 2,000 NRL press releases, 10,000 NRL-authored articles and conference papers, and over 200,000 articles from more than 200 journals. The total size of this collection is well over two million pages.

TORPEDO Ultra allows users to browse the Library's digital holdings in much the same way that users browse a library's physical stacks. For example, a TORPEDO Ultra browse lets users "drill down" to a specific article by selecting a particular journal from an alphabetic or subject list of journals, selecting a volume from the list of available volumes, selecting an issue from the list of issues in that volume, and finally selecting the article from the table of contents for that issue. Users can also search individual journals, groups of journals, reports, or the entire collection. For its search engine, TORPEDO Ultra relies on a commercial off-the-shelf (COTS) search engine called RetrievalWare from Excalibur Technologies. With RetrievalWare, users can perform field searches, say for all articles by a certain author, or they can search the full text of digital documents. RetrievalWare facilitates retrieval by permitting different types of full text searches. A concept search allows users to define terms using an online dictionary. Concept searches can be refined, for example by instructing the system to find "climate" only in its meteorological sense. Concept searches can also be expanded by instructing the system to find terms that are even loosely related to the search term. A pattern search uses a patented algorithm, called Adaptive Pattern Recognition Processing, to find near matches and is useful in compensating for possible OCR errors. A Boolean search option is also available. Search results for concept and pattern searches are presented in a user-defined sort, with the default set for relevance ranking so the "best" hits are displayed first.
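
The idea behind a pattern search, finding near matches so that OCR errors do not hide relevant documents, can be illustrated with a small sketch. Adaptive Pattern Recognition Processing is a proprietary RetrievalWare algorithm and is not reproduced here; this example substitutes Python's standard `difflib` similarity matching purely as a stand-in, and the function name and cutoff value are assumptions.

```python
import difflib


def near_matches(query, indexed_terms, cutoff=0.8):
    """Return indexed terms that closely resemble `query`, best match first.

    This is NOT the RetrievalWare algorithm; difflib's ratio-based
    matching is used only to illustrate tolerance for OCR errors.
    """
    candidates = [term.lower() for term in indexed_terms]
    return difflib.get_close_matches(query.lower(), candidates, n=5, cutoff=cutoff)


# "c1imate" imitates an OCR misreading of the letter "l" as the digit "1".
terms = ["c1imate", "climate", "primate", "ocean"]
print(near_matches("climate", terms))
```

With the cutoff of 0.8 the exact term and the OCR-damaged variant are both returned, while "primate" falls below the threshold. Lowering the cutoff widens the net at the cost of more false hits.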

Once users identify an article or report they're interested in from the displayed list of results, they can display it in PDF format using the Adobe Acrobat Reader. There are currently two types of PDF files in TORPEDO Ultra: those that have been produced from scanned bit-mapped images (wrapped PDF) and those that have been generated as part of the publication process (distilled PDF). All PDF files added to TORPEDO Ultra are optimized by the Library for fast retrieval and enhanced with thumbnails of all pages to facilitate document navigation.

Content-Enabling Catalogs, Databases, and Services

While TORPEDO Ultra provides NRL with a true "digital library," it serves another, equally important, function. Because TORPEDO Ultra documents are stored with a persistent URL based on the bibliographic characteristics of the document, they can be readily "fetched" by other databases in use by NRL and ONR researchers. TORPEDO Ultra therefore provides the underpinning for content-enabling a broad range of library services that previously were only able to "point" users to documents such as reports and journal articles. The Library has capitalized on this TORPEDO Ultra capability and is in the process of content-enabling three of its most heavily used services: its web-based catalog, its locally mounted Science Citation Index Expanded database, and an e-mail journal alerting service, called Contents-to-Go.
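
A persistent URL built from bibliographic characteristics might look like the following sketch. The article does not give the actual TORPEDO Ultra URL scheme, so the host name, the path layout, and the choice of ISSN, volume, issue, and first page as components are all hypothetical.

```python
from urllib.parse import quote

# Hypothetical base address, not the real TORPEDO Ultra service.
BASE_URL = "http://torpedo.example/fetch"


def persistent_url(issn, volume, issue, first_page):
    """Build a stable URL from fields that any citing database already holds."""
    parts = [issn, str(volume), str(issue), str(first_page)]
    # Percent-encode each component so unusual characters cannot break the path.
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


print(persistent_url("0031-9007", 75, 3, 410))
```

The value of such a scheme is that a catalog or citation database can construct the link from its own bibliographic record, without first asking the repository for an internal document identifier.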

The Library has added URLs to its online catalog for some 6,000 technical reports and 10,000 other NRL-authored publications that are available digitally in TORPEDO Ultra. This provides end users performing a web-based catalog search with immediate access to the full content of NRL technical reports and publications. Activating a hyperlink that says "Full Content" launches the Acrobat Reader and displays the PDF image of the first page of the report along with thumbnail page images. A similar process takes catalog users to digital holdings of journals that are in the TORPEDO Ultra collection. In the case of journals, clicking on the hyperlink connects users to the journal in TORPEDO, where they can browse digital holdings or perform a search.

One of the most frequently used InfoWeb databases is Science Citation Index Expanded produced by the Institute for Scientific Information (ISI) and mounted at NRL for the use of all employees served by a consortium of four federal science libraries known as the National Research Library Alliance (Stackpole & Atkinson 1998). NRL is currently working with ISI to add hyperlinks to those articles in Science Citation Index Expanded that are stored in TORPEDO Ultra. As new articles are added to TORPEDO Ultra, identifying information and URLs will be sent via FTP to ISI and automatically added to the next weekly database update installed by NRL. A user searching the citation database will see a "Full Text" button for any article that is stored digitally in TORPEDO Ultra. Clicking on the button will display the PDF file. Authorizations built into the database display the button only to users who are permitted to see a particular article, allowing each consortium member to customize its digital holdings and enter into individual license agreements with publishers. In phase two of this project, links to the ISI bibliographic record will be added to TORPEDO Ultra metadata to enable users to link to article references (in the citation database) and display the cited article (in TORPEDO Ultra). In a third phase, links will be added to journals licensed by consortium members for access from publisher web sites.
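
The per-member authorization for the "Full Text" button can be sketched as a simple lookup. The data model below is an assumption made for illustration; the article does not describe how the citation database stores or checks authorizations, and the organization names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArticleLink:
    """A citation record's link to stored full text (hypothetical structure)."""
    title: str
    url: str
    licensed_to: frozenset  # consortium members licensed to view the article


def full_text_url(link, user_org):
    """Return the URL to show behind the button, or None to hide the button."""
    return link.url if user_org in link.licensed_to else None


link = ArticleLink(
    title="Sample article",
    url="http://torpedo.example/fetch/sample.pdf",  # placeholder address
    licensed_to=frozenset({"Member A", "Member B"}),
)
print(full_text_url(link, "Member A"))  # URL is returned, button is shown
print(full_text_url(link, "Member C"))  # None, button is hidden
```

Keeping the license set on each record is what would let each consortium member negotiate its own agreements with publishers while sharing one database.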

One of the most popular "digital" services the NRL library has offered over the past several years is called Contents-to-Go. In the beginning, Contents-to-Go had no digital content at all. It combined an automated e-mail alerting service with old-fashioned document delivery (i.e., photocopies sent via interoffice mail).

Here's how Contents-to-Go works. The Library subscribes to electronic tables of contents for all journals in its collection. The tables of contents, which are e-mailed to a library mail server, are automatically redistributed to users who have "subscribed" to the service for specific journals. An InfoWeb interface allows users to select as many journals as they want by clicking a check box. Subscribing requires a user to enter a valid e-mail address, which enables the subscription module to automatically build the mailing lists. No staff intervention is required at any time. Users can add or subtract journals whenever they choose. The e-mail messages that go out to users have customized headers explaining how to request copies of articles, using the e-mail reply function.
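
The subscription and redistribution steps above can be sketched as follows. This is a minimal illustration, not the Library's code: the class and function names are hypothetical, the address check is deliberately crude, and actual mail delivery is left out.

```python
from collections import defaultdict


class Subscriptions:
    """Per-journal mailing lists built from user selections."""

    def __init__(self):
        self._lists = defaultdict(set)

    def subscribe(self, email, journals):
        # Crude validity check; a real system would verify the address properly.
        if "@" not in email:
            raise ValueError("a valid e-mail address is required")
        for journal in journals:
            self._lists[journal].add(email)

    def unsubscribe(self, email, journal):
        self._lists[journal].discard(email)

    def recipients(self, journal):
        return sorted(self._lists.get(journal, ()))


def redistribute(subs, journal, toc_text, header):
    """Pair each subscriber with the table of contents plus the custom header."""
    body = header + "\n\n" + toc_text
    return [(address, body) for address in subs.recipients(journal)]
```

Each returned pair could then be handed to a mail library such as `smtplib` for sending, which is how the process can run without staff intervention.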

An interim step in adding digital content to Contents-to-Go was to add a URL to the header of e-mailed tables of contents for those journals that are stored digitally. This URL takes users into the TORPEDO Ultra system, where they can then browse or search for the specific article in which they are interested. Similarly, links have been added to Contents-to-Go e-mail messages that take users to journals that are available from publisher web sites. In the very near future the Library will provide users with the ability to link from the e-mailed table of contents to each individual article in TORPEDO Ultra. This will be accomplished by replacing vendor-provided tables of contents for those journals that are part of TORPEDO Ultra with tables of contents automatically created by the TORPEDO Ultra system each time a new journal issue is added. These system-generated e-mailed tables of contents will incorporate URLs for each article, providing users with enhanced access to digital journal content.

Hardware Configuration

InfoWeb, Science Citation Index Expanded, and TORPEDO Ultra run on a Sun Enterprise 4000 server under the Sun Solaris operating system. Documents are stored on a Sun Storage Array RAID Level 5 array, which also accommodates the nearly 1 TB of online storage required by the Citation Index. The library catalog system runs on a separate Sun system, a Sun Ultra 170.

Network access to restricted InfoWeb services is controlled by IP filtering, using Unix TCP-wrappers and IP address ranges.
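
Filtering by IP address range, the principle behind the TCP-wrappers configuration mentioned above, can be illustrated at the application level. The sketch below does not reproduce NRL's configuration; the address ranges are reserved documentation ranges used as placeholders, and the function name is an assumption.

```python
import ipaddress

# Placeholder ranges reserved for documentation, not NRL's real networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if the client address falls inside any permitted range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a well-formed address is refused.
        return False
    return any(address in network for network in networks)


print(is_allowed("192.0.2.17"))   # True: inside the first range
print(is_allowed("203.0.113.5"))  # False: outside both ranges
```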

Advantages of TORPEDO Ultra

TORPEDO Ultra provides a number of advantages to NRL researchers over other web-based journal retrieval systems offered by individual publishers or content aggregators:

Future Development

One of the primary goals of TORPEDO Ultra is to provide users with a single interface for searching the electronic journals of many publishers. For locally archived journals, as well as other materials, this goal has been met. However, for those journals that are resident on publisher web sites, users continue to encounter many diverse approaches to browsing and searching, which vary widely from publisher to publisher. Since any given subject area is usually represented by the journals of multiple publishers, it is not uncommon for users to need to be conversant with the protocols of seven or eight different publishers. The Library has created a hyperlinked alphabetical journal list in InfoWeb to enable users to finesse the need to associate a journal with its publisher. However, users are still, for the most part, limited to searching across journals produced by a single publisher, or hyperlinking from references to articles that are part of that publisher's archive.

While NRL anticipates implementing agreements with many more publishers for local journal archiving, in some cases it finds that publishers will agree to provide access to digital journals only from their own web site. In the beginning, the rationale for this resistance appeared to be the fear of "losing control" over copyrighted content. However, more recently, NRL is finding that publishers, particularly those that have invested heavily in enhancing their web journal offerings, are increasingly reluctant to allow libraries to load journals locally. Examples of common publisher enhancements are links from references to records in auxiliary databases or digital archives, forward and backward hyperlinking to citing or cited articles, and links to external web sites. The availability of an enhanced hyperlinked web journal appears to cause publishers to view digital versions of the printed publication as inferior products, which they are hesitant to provide to libraries.

TORPEDO Ultra offers a solution to the problem of simultaneously searching or browsing journals on remote web sites along with locally mounted publications. Implementing this solution requires that the journal content, which would remain on the publisher web site, be indexed in TORPEDO Ultra. Users could then browse or search TORPEDO Ultra and click on a hyperlinked title to display the full document in PDF regardless of whether it resides locally or on a remote server. NRL is currently exploring two approaches to provide users with such unified searching of locally mounted and distributed documents. Both approaches, which will probably both be available as publisher options, achieve the same end result; they provide the two types of data required by TORPEDO Ultra: bibliographic records for fielded searching and display, and access to the full text for indexing purposes.
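
The unified approach can be pictured as a single index whose entries hold the two kinds of data named above, bibliographic fields and indexed full text, plus a URL that may point either to the local repository or to a publisher site. The following sketch is an assumption about how such an index could be modeled; the record structure, the naive substring search, and the addresses are all hypothetical and unrelated to RetrievalWare's actual indexing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedArticle:
    title: str      # bibliographic data for fielded searching and display
    author: str
    full_text: str  # text indexed for searching; the PDF itself may live elsewhere
    url: str        # where the PDF is served from
    local: bool     # True if stored in the local repository


def search(index, term):
    """Return (title, url) for every match, whether local or remote."""
    needle = term.lower()
    return [
        (article.title, article.url)
        for article in index
        if needle in article.full_text.lower() or needle in article.title.lower()
    ]


index = [
    IndexedArticle("Ocean acoustics", "A. Author", "sound propagation in sea water",
                   "http://torpedo.example/local/1.pdf", True),
    IndexedArticle("Acoustic sensors", "B. Author", "sensor arrays for sound detection",
                   "http://publisher.example/remote/2.pdf", False),
]
print(search(index, "sound"))
```

Because the search runs only against the index, the user sees one result list and one kind of link, and the location of the document matters only when the PDF is fetched.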

In the first approach, which is currently being tested with the American Physical Society, the publisher has provided sample article content in SGML format. The intent is for NRL to extract bibliographic data from the delivered SGML file and to index the full text of the entire article for searching. An alternative approach, to be tested later this year with the American Institute of Physics (AIP), will allow NRL to use a RetrievalWare Spider to index journal content on the AIP web site; the publisher will provide associated bibliographic data from the AIP SPIN database for use in TORPEDO Ultra.

Conclusion

The NRL Library has made significant progress in providing its user community in four geographic locations with a single point of access to the information needed to support scientific research. Locally-mounted and remote databases and publications are available to researchers through a web-based information system and gateway known as InfoWeb. In addition to linking users to journals on publisher web sites, InfoWeb serves as the portal to a local digital repository with sophisticated online browse and search capabilities, called TORPEDO Ultra. Over 200 journals from three publishers are currently available for browsing or searching through TORPEDO Ultra. In addition, these journals are linked from the library's catalog, from a locally-mounted Web of Science database, and from a journal alerting service, providing a spectrum of content-capable services. In support of its goal to provide users with a unified approach to retrieving journal content regardless of location, the Library has developed and is testing a strategy to link users to journals that reside on publisher web sites through a TORPEDO Ultra browse or search.

References

Atkinson, Roderick D., Stackpole, Laurie E. & Yokley, John. 1995. Developing the Scientific-Technical Digital Library at a National Laboratory. In: Digital Libraries: Current Issues, Digital Libraries Workshop DL '94, Newark, NJ, USA, May 1994, Selected Papers (ed. by Nabil R. Adam, Bharat K. Bhargava, and Yelena Yesha), pp. 265-279. Springer-Verlag, Berlin. [Online]. Available: {http://infoweb.nrl.navy.mil/NRL_publications/digital_library_94.html} [May 17, 1999]

Atkinson, Roderick D. & Stackpole, Laurie E. 1995. TORPEDO: Networked Access to Full-Text and Page-Image Representations of Physics Journals and Technical Reports. The Public-Access Computer Systems Review 6(3). [Online]. Available: {http://epress.lib.uh.edu/pr/v6/n3/atki6n3.html} [May 17, 1999]

Stackpole, Laurie E. & Atkinson, Roderick D. 1998. The National Research Library Alliance: A Federal Consortium Formed to Provide Inter-Agency Access to Scientific Information. Issues in Science & Technology Librarianship, No. 18, Spring 1998. [Online]. Available: http://www.istl.org/98-spring/article6.html [May 17, 1999]
