Issues in Science and Technology Librarianship | Winter 2004
DOI:10.5062/F40C4SQB
Helen F. Smith
Agricultural Sciences Librarian
The Pennsylvania State University
hfs1@psu.edu
Claire E. Hoffman
Head Librarian, Abington College Library
The Pennsylvania State University
ceh8@psu.edu
The Libraries at the Pennsylvania State University subscribe to the online databases Web of Science and Current Contents Connect. Concern was expressed regarding the great similarity in coverage between them. A comparison of title coverage found that Web of Science was more inclusive than Current Contents Connect across all disciplines. When updating frequency was compared, new science and social science journal issues appeared in both databases the same week approximately three quarters of the time. In the arts and humanities this is true only about half the time but these data are not as conclusive due to the small sample size. Each database has unique features. Web of Science has superior title coverage, while Current Contents Connect updates faster about 25% of the time. Unexpected significant problems were noted with updates to Current Contents Connect regarding timing of the updates and the definition of a "current week." The relative importance of the advantages and disadvantages of the two databases will vary depending on institutional needs.
The Institute for Scientific Information (ISI) in Philadelphia, PA, produces publications of both types. Their flagship print publication, Science Citation Index (SCI), falls into the first category. This index, along with its two sister publications, Social Sciences Citation Index (SSCI) and Arts and Humanities Citation Index (AHCI), uses what ISI calls Permuterm Subject Indexing. Keywords within an article title are combined with other keywords in the same title in a rudimentary form of Boolean logic. This allows the user to be more precise in identifying relevant articles in a subject search. Corporate and individual author searching are also available. What makes these publications unique is that they also index the references cited at the end of the source papers. As is typical of indexing publications, these three sacrifice speed of publication for the extra features. The paper AHCI comes out twice a year, SSCI three times a year, and SCI six times a year.
ISI's best-known current awareness publications are a septet known collectively as Current Contents (CC). Each part of this set is devoted to a particular group of subjects: Life Sciences; Agriculture, Biology, and Environmental Sciences; Physical, Chemical and Earth Sciences; Clinical Medicine; Engineering, Computing, and Technology; Social and Behavioral Sciences; and Arts and Humanities. These publications are largely reproductions of journals' tables of contents, with minimal author and title indexing. The science and technology sections are published weekly; other sections are published at least every other week. For the science sections, ISI claims a two-week lag between receiving a journal issue and publishing its information in a CC issue (Institute for Scientific Information 2000).
Current Contents Connect, the electronic version of the print septet, is updated daily; Web of Science, the electronic incarnation of the three citation indexes, is updated every week. The weekly schedule is comparable in frequency to many print current awareness products, including Current Contents. Although separate subscriptions are available to each of the citation indexes and the seven Current Contents editions, the Pennsylvania State University has subscribed to all of them via the Web of Science (WOS) and Current Contents Connect (CCC) services (Table 1).
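Web of Science | Current Contents Connect |
---|---|
Science Citation Index | Life Sciences; Agriculture, Biology, and Environmental Sciences; Physical, Chemical and Earth Sciences; Clinical Medicine; Engineering, Computing, and Technology |
Social Sciences Citation Index | Social and Behavioral Sciences |
Arts & Humanities Citation Index | Arts and Humanities |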
Anecdotal information suggested that the two databases were more obviously similar to each other than their print predecessors were. Title coverage was compared, and Table 2 shows the results of this comparison. With the exception of one arts and humanities title, all the titles in CCC were also indexed in WOS. The opposite was not true: an average of 10 percent of the titles indexed in WOS were not indexed in CCC. Some titles were indexed in more than one section, causing the total title number for "All Sections" to be less than the sum of the titles from each section.
Description | All Sections | Sciences | Social Sci. | Arts & Hum. |
---|---|---|---|---|
Total titles (number) | 8356 | 5829 | 1740 | 1139 |
Titles in both WOS & CCC | 90% | 86% | 92% | 98% |
Titles in WOS only | 10% | 15% | 8% | 2% |
Titles in CCC only | <0.1% | 0% | 0% | <0.1% |
Searching and interface features aside, the other area of interest is the updating frequency of the databases. The purpose of this study was to compare the updating frequency of Current Contents Connect and Web of Science.
The sample size was calculated using the formula $n = N / (1 + Ne^2)$, as described by Yamane (1973). In this formula, n represents the total sample size, N is the size of the total population, and e represents the rate of error, which we chose to be 5 percent. This was the set of titles chosen by the random method.
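A minimal sketch of this sample-size calculation, assuming the simplified form of Yamane's formula stated in the docstring and taking N to be the 8,356 titles in the "All Sections" column of Table 2. The function name is illustrative and not part of the study; rounding up to a whole title is also an assumption.

```python
import math


def yamane_sample_size(population: int, error: float) -> int:
    """Simplified Yamane formula: n = N / (1 + N * e**2), rounded up."""
    # Rounding up is an assumption; the article does not say how n was rounded.
    return math.ceil(population / (1 + population * error ** 2))


# N = 8356 titles (Table 2, "All Sections"), e = 5 percent
print(yamane_sample_size(8356, 0.05))  # 382
```

The result, 382, equals the sum of the three total sample sizes reported in Table 3 (256 + 76 + 50).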
Several problems arose with the titles chosen by this random method:
Although some extra titles had been selected, there were not enough for the arts and humanities or the science sections. Additional titles were added from a list of journals most frequently cited by authors at Penn State-University Park. These titles were used in order from most to least frequently cited, with the ranking compiled by averaging the number of citations to each journal for the years 1997-1999, according to data from each of the three Citation Indexes. This resulted in a set of titles chosen via the citation method.
The social sciences list did not experience this problem; at the end of the study, we actually had more titles than we needed. To retain the correct subject proportions, a few titles were eliminated using a random number table (Beyer 1987), reducing the list to the correct size. The sample size for each method of choosing titles (random or citation) is listed in Table 3.
Description | Random Sample Size | Citation Sample Size | Total Sample Size |
---|---|---|---|
Number of science titles | 219 | 37 | 256 |
Number of social sciences titles | 76 | 0 | 76 |
Number of arts & humanities titles | 47 | 3 | 50 |
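The subject proportions can be checked with a short sketch. It assumes the total sample of 382 titles was split in proportion to each section's title count from Table 2; the article does not spell out the allocation rule, so this is an assumption, and the function name is illustrative. Because some titles are indexed in more than one section, the section counts sum to 8,708 rather than 8,356, and that sum is used as the denominator here.

```python
def allocate_proportionally(total_sample, section_sizes):
    """Split total_sample across sections in proportion to their sizes."""
    population = sum(section_sizes.values())
    # Simple rounding; it does not guarantee in general that the parts
    # add up to total_sample, although they do for these figures.
    return {
        name: round(total_sample * size / population)
        for name, size in section_sizes.items()
    }


# Title counts per section, taken from Table 2
section_titles = {
    "sciences": 5829,
    "social sciences": 1740,
    "arts & humanities": 1139,
}

print(allocate_proportionally(382, section_titles))
# {'sciences': 256, 'social sciences': 76, 'arts & humanities': 50}
```

Under that assumption the allocation reproduces the total sample sizes in Table 3.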
The study was conducted for a total of ten weeks between 28 July 2000 and 22 September 2000. Each Friday, all the titles in the study were searched in both databases to determine whether any new issues of the title had been added during the previous week. This process continued for the next eight weeks. The tenth week was a "wrap-up" week: a title was searched only if an issue had been added to one database but not to the other.
Description | Random Titles Not Updated (#) | Random Titles Not Updated (%) | Citation Titles Not Updated (#) | Citation Titles Not Updated (%) | Total Not Updated (#) | Total Not Updated (%) |
---|---|---|---|---|---|---|
Number of science titles | 48 | 22 | 0 | 0 | 48 | 19 |
Number of social sciences titles | 33 | 43 | 0 | 0 | 33 | 43 |
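Number of arts & humanities titles | 27 | 2 | 30 |

Description (sciences) | Random Titles Updated (#) | Random Titles Updated (%) | Citation Titles Updated (#) | Citation Titles Updated (%) | Total Titles Updated (#) | Total Titles Updated (%) |
---|---|---|---|---|---|---|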
Number usable issues | 300 | 58 | 214 | 42 | 514 | 100 |
Number issues updated same time | 217 | 72 | 162 | 76 | 379 | 74 |
Number issues CCC updated first | 82 | 27 | 51 | 24 | 133 | 26 |
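Number of issues WOS updated first | 1 | <1 | 1 | <1 | 2 | <1 |

Description (social sciences) | Random Titles Updated (#) | Random Titles Updated (%) | Citation Titles Updated (#) | Citation Titles Updated (%) | Total Titles Updated (#) | Total Titles Updated (%) |
---|---|---|---|---|---|---|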
Number usable issues | 55 | 100 | 0 | 0 | 55 | 100 |
Number issues updated same time | 40 | 73 | 0 | 0 | 40 | 73 |
Number issues CCC updated first | 15 | 27 | 0 | 0 | 15 | 27 |
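Number of issues WOS updated first | 0 | 0 | 0 | 0 | 0 | 0 |

Description (arts & humanities) | Random Titles Updated (#) | Random Titles Updated (%) | Citation Titles Updated (#) | Citation Titles Updated (%) | Total Titles Updated (#) | Total Titles Updated (%) |
---|---|---|---|---|---|---|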
Number usable issues | 26 | 96 | 1 | 4 | 27 | 100 |
Number issues updated same time | 14 | 54 | 0 | 0 | 14 | 52 |
Number issues CCC updated first | 11 | 42 | 1 | 100 | 12 | 44 |
Number of issues WOS updated first | 1 | 4 | 0 | 0 | 1 | 4 |
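The percentages in these tables can be recomputed from the issue counts. The sketch below assumes the three tables above report, in order, the sciences, the social sciences, and the arts and humanities, and uses the counts from their "Total Titles Updated" columns. The combined figures are derived here for illustration and are not reported in the article; the function and variable names are hypothetical.

```python
def percent_shares(counts):
    """Return each outcome's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {outcome: round(100 * n / total, 1) for outcome, n in counts.items()}


# Issue counts from the "Total Titles Updated" columns (assumed section order)
issues_by_section = {
    "sciences": {"same_time": 379, "ccc_first": 133, "wos_first": 2},
    "social sciences": {"same_time": 40, "ccc_first": 15, "wos_first": 0},
    "arts & humanities": {"same_time": 14, "ccc_first": 12, "wos_first": 1},
}

# Pool the three sections into one set of counts
combined = {}
for counts in issues_by_section.values():
    for outcome, n in counts.items():
        combined[outcome] = combined.get(outcome, 0) + n

for section, counts in issues_by_section.items():
    print(section, percent_shares(counts))
print("all sections", combined, percent_shares(combined))
```

Pooled over all 596 usable issues, CCC was updated first for 160 of them, or a little under 27 percent.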
In the course of this project, a significant and disturbing fact was noted regarding Current Contents Connect's definition of a current week. CCC allows users to limit their searches to selected date spans. Since it has traditionally been a print publication that appeared weekly, it is logical to assume that one of the limit periods in the electronic version would be a week's worth of data. CCC has a limit labeled "current week", which according to the internal database help "includes journal issues and Current Book Contents for the current week. The span of dates given in parentheses defines the current week. Because Current Contents data are updated daily, this date span changes daily." However, the implication of the phrase "current week" to a user is that it represents a seven-day period. It was found that the "current week" could be anywhere from one to eight days. The date ranges defining "current week" were recorded when the searches were done. These are shown in Table 8.
Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
---|---|---|---|---|---|
| | | | | 28 July: 27th-27th | 29 July: 27th-29th |
31 July: 27th-29th | 1 August: 27th-31st | 2 August: 27th-1st | 3 August: 27th-3rd | 4 August: 3rd-3rd | 5 August |
7 August: 3rd-4th | 8 August: 3rd-4th | 9 August: 3rd-8th | 10 August: 3rd-10th | 11 August: 10th-10th | 12 August |
14 August: 10th-11th | 15 August: 10th-14th | 16 August: 10th-15th | 17 August: 10th-15th | 18 August: 17th-17th | 19 August |
21 August: 17th-18th | 22 August: 17th-21st | 23 August: 17th-22nd | 24 August: 17th-24th | 25 August: 24th-24th | 26 August |
28 August: 24th-25th | 29 August: 24th-28th | 30 August: 24th-30th | 31 August: 31st-31st | 1 September: 31st-1st | 2 September |
4 September: HOLIDAY | 5 September: 31st-1st | 6 September: 31st-6th | 7 September: 31st-6th | 8 September: 7th-7th | 9 September |
11 September: 7th-8th | 12 September: 7th-11th | 13 September: 7th-12th | 14 September: 7th-12th | 15 September: 7:21am, 7th-14th; 11:48am, 14th-14th | 16 September |
18 September: 14th-15th | 19 September: 14th-18th | 20 September: 7:37am, 14th-18th; 11:25am, 14th-18th; 11:26am, 14th-20th | 21 September: 14th-21st | 22 September: 7:30am, 14th-21st; 9:00am, 21st-21st | 23 September |
25 September: 21st-22nd | 26 September: 21st-25th | 27 September: 21st-25th | 28 September: 21st-28th | 29 September: 21st-28th | 30 September |
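The length of a displayed "current week" can be worked out from its endpoints. This is a minimal sketch, assuming both endpoints of the displayed range are included in the span; the helper name is illustrative.

```python
from datetime import date


def span_days(start: date, end: date) -> int:
    """Number of calendar days in a date range, counting both endpoints."""
    return (end - start).days + 1


# Thursday 3 August 2000: the current week was shown as 27th-3rd
print(span_days(date(2000, 7, 27), date(2000, 8, 3)))  # 8

# Friday 4 August 2000: the current week was shown as 3rd-3rd
print(span_days(date(2000, 8, 3), date(2000, 8, 3)))  # 1
```

These two consecutive days show the extremes recorded during the study: an eight-day span on Thursday and a one-day span on Friday.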
Although the study was conducted in 2000, an examination of the time spans given for current weeks during September 2002 indicated that the situation still exists. The database is still apparently updated during the working day (Eastern Time), and the time period covered by a "current week" can be as little as one day or as much as seven. If a scholar regularly runs a search each week, the coverage of the material retrieved might show significant gaps, depending on the day of the week and the time of day that the search was conducted. Unlike CCC, Web of Science is consistently updated once a week. No matter how early the checks were run on Fridays during the study period, Web of Science was already showing its new update.
Bottle, R.T., ed. 1979. Use of Chemical Literature. 3rd ed. London: Butterworths.
Institute for Scientific Information. 2000. Current Contents: Physical, Chemical & Earth Sciences. 40(3): 1.
Yamane, T. 1973. Statistics, an Introductory Analysis. 3rd ed. New York: Harper and Row.
The authors thank Linda Musser, Head, Earth & Mineral Sciences Library, Penn State, for her comments and suggestions on the manuscript.