Issues in Science and Technology Librarianship
Winter 2001
DOI:10.5062/F4BV7DK2

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

Database Reviews and Reports

Research Index

Melissa Holmberg
Electronic Resources/Science Librarian
Memorial Library
Minnesota State University, Mankato
melissa.holmberg@mnsu.edu

Research Index, formerly known as CiteSeer, is a free citation index for computer science and other disciplines utilizing technology available at {http://citeseer.ist.psu.edu/cs}. Unlike most search engines, this database indexes postscript and PDF documents in addition to HTML files. The database parses the citations, identifies citations to the same paper, determines the context of citations within the documents, and indexes the full text. This combination of actions allows the user to search for articles by keyword, title, or author, find scholarly documents that cite a particular article, and look at the context of citations made within and to a particular article.
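To illustrate what "identifies citations to the same paper" involves, here is a minimal sketch. It is not the method Research Index itself uses. It assumes that two citation strings refer to the same paper when their significant words overlap heavily. The function names, the three-letter word cutoff, and the 0.6 threshold are hypothetical choices made only for illustration.

```python
import re

# Sample citation strings written in different styles (illustrative data).
CITATIONS = [
    "Giles, C. L., Bollacker, K. D., and Lawrence, S. 1998. "
    "CiteSeer: an automatic citation indexing system.",
    "C. Lee Giles, Kurt Bollacker, Steve Lawrence. "
    "CiteSeer: An Automatic Citation Indexing System. 1998.",
    "Lawrence, S., Giles, C. L., and Bollacker, K. 1999. "
    "Digital libraries and autonomous citation indexing.",
]


def tokens(citation):
    """Return the set of lowercase words with three or more letters.

    Punctuation, initials, and numbers are ignored (a hypothetical rule).
    """
    return set(re.findall(r"[a-z]{3,}", citation.lower()))


def same_paper(a, b, threshold=0.6):
    """Guess whether two citation strings point at the same paper.

    Uses Jaccard word overlap; the threshold is an assumption.
    """
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return False
    return len(ta & tb) / len(ta | tb) >= threshold


def group_citations(citations, threshold=0.6):
    """Greedily group citations, comparing each to a group's first member."""
    groups = []
    for citation in citations:
        for group in groups:
            if same_paper(citation, group[0], threshold):
                group.append(citation)
                break
        else:
            groups.append([citation])
    return groups


groups = group_citations(CITATIONS)
```

With this sample data, the two differently formatted citations to the 1998 paper land in one group and the 1999 paper in another. A word-overlap rule this simple will misfire on papers by the same authors with similar titles, which may help explain why duplicate records are hard to eliminate entirely.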

Search Page

Upon accessing the search page (see figure 1 below), users are invited to "read the welcome message and query instructions." The query instructions are extremely brief, covering only adjacency searching.

Figure 1: Search page of Research Index

Below the search box, users can click on various links for limited browsing. One of the links, "Computer Science Directory," appears to be a partial index of topics and sub-topics (see figure 2). Clicking on any of the topics results in a list of citations and articles with a banner of related linked topics. The other browsing areas include documents most accessed and authors and documents most cited.

Figure 2: Computer Science Directory, or limited keyword index, in Research Index

When the limited browsing does not cover the user's needs, the search box amidst the various banners and links can be utilized. Search capabilities are limited to AND, OR, NOT, phrases (words next to each other), and adjacency searches with the w/# command. Research Index automatically stems terms. Pressing enter after typing a search strategy will default to searching the citations. Clicking on the "Search Articles" bar will retrieve full-text documents available in the database.
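The combination of automatic stemming and the w/# adjacency command can be pictured with a small sketch. The code below is an illustration only and does not reproduce Research Index's own matching. It assumes that "A w/n B" means the two terms appear within n words of each other in either order, and it uses a crude suffix-stripping function as a stand-in for a real stemmer. The names stem and within are hypothetical.

```python
import re


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def within(text, term_a, term_b, n):
    """Return True if the stemmed terms occur within n words of each other.

    Assumed reading of a query such as: term_a w/n term_b
    """
    words = [stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]
    a, b = stem(term_a.lower()), stem(term_b.lower())
    positions_a = [i for i, w in enumerate(words) if w == a]
    positions_b = [i for i, w in enumerate(words) if w == b]
    return any(0 < abs(i - j) <= n for i in positions_a for j in positions_b)


TEXT = "Autonomous citation indexing builds citation indexes for digital libraries"

near = within(TEXT, "indexes", "digital", 2)     # roughly: indexes w/2 digital
far = within(TEXT, "autonomous", "digital", 3)   # roughly: autonomous w/3 digital
stemmed = within(TEXT, "indexing", "citations", 1)
```

In this sketch, the third query matches even though the text never contains the word "citations," because both the query term and the text are reduced to the same stem before comparison.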

List of Results

The list of search results (see figure 3) presents the post-search field limits for the title and header fields -- the header field includes the title and author -- just below the top banner. Following the field limits are options for reordering the list of results and changing the number of records listed per page. Although the option to reorder results by date exists, users may prefer the default display by rank. Despite the creators' efforts to remove duplicates through complex algorithms, identical records often appear listed one after the other in the default display and sometimes become separated when sorted by date.

The list of results appears below the banner. The title is linked to the detailed record. Following the title, the "correct" link allows users to send citation changes to the database creators. A display of the search terms in context follows these links.

Figure 3: Results List

Detailed Record

The detailed record (see figure 4) presents several options for the user. The banner alone includes at least 15 links when searching articles. Below the top banner are nine informational sections related to the document, none of which includes the document itself.

Figure 4: Partial detailed record display

For most undergraduate students, probably only a few items will be important. The abstract, just below the banner, will obviously help them determine if they want to download the article. The other important areas include the links on the right side of the banner. When the document is available, the first link(s), unlabeled, give the URL(s) for the original document. The "cached" links below the URL provide alternative formats for a cached copy of the original document. Although most computer science researchers will have an application which reads postscript files, the cached PDF links can be helpful to libraries providing only browser and Adobe Acrobat Reader support. The cached image links display postscript documents with just a browser, but only one page at a time may be viewed or output.

Some students might find the "From" and "Home" links useful. The "From" link provides the original page on which the document was found. The "Home" link(s) provide the homepage(s) of the authors, which could be useful for finding other research materials, locating the authors' opinions and theories, and ascertaining the authors' credibility. Serious researchers may also find the "Track Related" link, near the middle of the banner, to be a helpful feature. Searchers can use the "Track Related" link to set up an e-mail alert for new citations and documents added to the database which cite the current document, are cited by the current citation, and/or cover similar topics.

Other information given through the detailed record includes "Site Documents," the list of other documents listed on the same site where the article was found; "Cited by," the list of publications found which cited the current record; "Active bibliography," a list of documents found via various algorithms to be similar to the current record; a list of documents viewed by users who also looked at this record; related documents found through co-citation analysis; BibTeX citation; and the bibliography. Additionally, users can view the context of the citations citing the current record as well as the context of documents being cited.
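For readers unfamiliar with co-citation analysis, the general idea is that two documents are related when other papers cite both of them. The sketch below shows that idea in its simplest form. It is not Research Index's implementation. The document labels, the sample reference lists, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each citing paper mapped to the documents it cites.
REFERENCES = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}


def cocitation_counts(references):
    """Count how many citing papers cite each pair of documents together."""
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


def related_by_cocitation(doc, references):
    """List documents co-cited with doc, most frequently co-cited first."""
    related = Counter()
    for (a, b), n in cocitation_counts(references).items():
        if a == doc:
            related[b] += n
        elif b == doc:
            related[a] += n
    return related.most_common()


related_to_a = related_by_cocitation("A", REFERENCES)
```

Here document A is co-cited with B by two papers and with C by one, so B would be listed first among the documents related to A.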

Output of Results

Research Index does not include internal mechanisms to print, e-mail, or download content. Searchers must follow the output procedures of the application used to display the document(s).

Advantages of Research Index

Research Index locates documents posted to the web. Since several authors post pre-prints to their web sites much earlier than the articles appear in printed journals, searchers may find more current information than they would through commercial databases. The autonomous nature of Research Index keeps the cost of maintaining the index much lower than other citation indexes, which are often manually created, and thus provides a free alternative or complement to other indexes. Through the various algorithms, the database can also give up-to-date impact measures of particular articles.

Another advantage of Research Index is the availability of the algorithms, software, and data for non-commercial use. Such availability may provide similar databases for other disciplines as well as provide a great learning opportunity for students.

Disadvantages of Research Index

Traditional indexes provide a list of journals and subjects which help researchers select the appropriate search tool. Research Index does not cover a core set of journals or a core set of subjects. Thus, with Research Index, searchers must remember they may not find anything for a particular topic or citation.

Additionally, free resources do not maintain relationships with librarians. Thus, updates are "discovered," not reported. Despite an available e-mail list, updates are still not reported to interested users. Furthermore, the help screens and search tips are extremely minimal. Most information about Research Index must be gleaned from the creators' articles about the database.

Comparisons to Other Databases

Comparing Research Index to other databases seems difficult given the different formats indexed in the resources. However, to stress the value of this free resource, the author conducted a quick search on a current hot topic. Searching Bluetooth in Research Index brought up 54 unique documents. The same search in Computer Abstracts resulted in one record. In Computer and Information Systems Abstracts, the search found 7 records. Searching Bluetooth in Compendex resulted in 35 records. Very little overlap appeared among the results. While such results may not occur for all searches, they do indicate the utility of Research Index for current topics.

Since Research Index was designed as a citation index, it seemed reasonable to also compare results to a commercial citation index. Searching for cited references to Bollacker produced no results in either SciSearch or Social SciSearch. In Research Index, though, seven documents, including four self-citations, cite the article "Digital Libraries and Autonomous Citation Indexing," written by Lawrence, Giles, and Bollacker. Thus, Research Index again serves as a good complement to the commercial databases.

Conclusions

Although it is difficult to discover interface changes, several alterations in layout and browsing capabilities have occurred during the past 18 months. Such updating brings hope for other improvements. One desired improvement would be a more comprehensive index, or "computer science directory," for the students who have not yet narrowed their topics. Informative labels for the sections on the detailed record would help users as well. For example, "site documents" is a confusing label for the novice user. Fewer overlapping access points on the same screen for the same information would also be helpful. While multiple access points are often favored, it becomes confusing when they exist on the same scrollable page. Lastly, grouping similar sections together may help limit the number of choices facing users. Through the link for "Related Documents," the "Active Bibliography," "Users who viewed this document also viewed," and "Related documents from co-citation" sections can be viewed together on one page. The combination of such groupings with the elimination of the multiple access points would greatly improve the user's understanding of the information presented.

Ultimately, despite the disadvantages and the needed improvements, Research Index offers another computer science resource for locating quality information. In comparison to commercial resources, it complements the researcher's needs by providing access to different resources. Lastly, although the database cannot provide a concrete list of subjects, the author has not yet failed to find articles in Research Index for needed information in the computer science field or in other fields utilizing computer technologies.

Bibliography

Giles, C. Lee, Bollacker, Kurt D. and Lawrence, Steve. 1998. "CiteSeer: an automatic citation indexing system" in Digital Libraries 98 - Third ACM Conference on Digital Libraries, pp. 89-98. New York: ACM Press.

Lawrence, Steve, Giles, C. Lee, and Bollacker, Kurt. 1999. "Digital libraries and autonomous citation indexing." Computer 32(6): 67-71.
