BIBLIOGRAPHIC RETRIEVAL FROM BIBLIOGRAPHIC INPUT; 
THE HYPOTHESIS AND CONSTRUCTION OF A TEST 

Frederick H. RUECKING, Jr.: Head, Data Processing Division, 
The Fondren Library, Rice University, Houston, Texas 


A study of problems associated with bibliographic retrieval using unveri-
fied input data supplied by requesters. A code derived from compression 
of title and author information to four, four-character abbreviations each 
was used for retrieval tests on an IBM 1401 computer. Retrieval accuracy 
was 98.67%. 

Current acquisitions systems that use computer processing have been
oriented toward handling the order request only after it has been
manually verified. Systems such as that of Texas A & I University (1)
have proven useful in reducing certain clerical routines and in handling
fund accounting (2). Lack of a larger bibliographic data base and lack
of adequate computer time have prevented many libraries from studying
more sophisticated acquisitions systems.

At the time the MARC Pilot Project (3) was started, the Fondren Library
at Rice University did not have operating computer applications in
acquisitions, serials, or cataloging. The University administration and the
Research Computation Center provided sufficient access to the IBM 7040
to permit the study of problems associated with bibliographic retrieval
using input data of varying accuracy.

In 1966, Richmond expressed the concern of many librarians about the
lack of specific statements describing the techniques by which on-line
retrieval could be accomplished without complicating the problems
presented by the current card catalog (4). She had previously described
some of the problems created by the kind and quality of data being
utilized as references by library users (5).


228 Journal of Library Automation Vol. 1/ 4 December, 1968 

An examination of the pertinent literature indicates that most of the
current work in retrieval, while related to problems of bibliographic
retrieval, does not offer much assistance when the input data is suspect
(6, 7, 8). Tainiter and Toyoda, for example, have described different
techniques of addressing storage using known input data (9, 10).

One of the best-known retrieval systems is that of the Chemical Abstracts 
Service, which provides a fairly sophisticated title-scan of journal articles 
with a surprising degree of flexibility in the logic and term structure used 
as input. Comparable systems are used by the Defense Documentation 
Center, Medlars Centers, and NASA Technology Centers. These systems 
have one specific feature in common: a high level of accuracy in the input 
data. 

USER-SUPPLIED BIBLIOGRAPHIC DATA 

The reliability of bibliographic data supplied to university libraries from
faculty and students has long been questioned (5). Any search system
which accepts such data must be designed 1) to increase the level of
confidence through machine-generated search structures and variable
thresholds and 2) to reduce the dependence upon spelling accuracy,
punctuation, spacing, and word order.

The initial task in formulating an approach to this problem is to
determine the type, quality, and quantity of data generally supplied by a
user. To derive a controlled set of data for this purpose, the Acquisition
Department of the Fondren Library provided Xerox copies of all
English-language requests dated 1965 or later. A random sample of 295
requests was drawn from that file of 5000 items.

This random sample was compared to the manually-verified, original 
order-requests to determine 1) the frequency with which data was sup-
plied by the requestor and 2) the accuracy of the provided information. 
Results of this study are given in Table 1. 

Table 1. Level of Confidence in the Input Data

Data         Times    Times      Accuracy    Level of
Elements     Given    Correct    (%)         Confidence (%)
Edition       295      294        99.6        99.6
Title         295      292        99.0        99.0
Author        290      264        91.0        82.7
Publish.      268      218        81.3        73.9
Date          265      215        81.1        72.8

The results suggest that edition can have great significance when
specified and should be used as strong supporting evidence for retrieval.
It should not necessarily be a restrictive element, however, because an
edition was actually specified only five times in the sample.
(Unstated editions were considered as first editions, and correct.)


Bibliographic Retrieval / RUECKING 229

Title is the most significant and most reliable element. As Richmond
indicates, use of the entire title for searching would present distinct
problems for retrieval systems (4). Consequently, an abbreviated version of
the title must be derived from the input data which will reduce the impact
and significance of the problems described by Richmond (5).

THE HYPOTHESIS 

It is hypothesized that retrieval of correct bibliographic entries can be
obtained from unverified, user-supplied, input data through the use of a 
code derived from the compression of author and title information sup-
plied by the user. It is assumed that a similar code is provided for all en-
tries of the data base using the same compression rules for main and added 
entry, title and added title information. 

It is further hypothesized that use of weighting factors for individual
segments of the code will provide accurate retrieval in those cases when 
exact matching does not occur. Before the retrieval methodology can be 
described, it is necessary to outline the compression technique to be used 
with author and title words. 

TITLE COMPRESSION 

To gain some understanding of the problems to be faced in compressing 
title information, a random sample of 500 titles was drawn from the first 
half of the initial MARC I reel (about 4800 titles). Each of these titles 
was analyzed for significant words and tabulations were made on word 
strings and word frequencies. The following words were considered as
non-significant: a, an, and, by, if, in, of, on, the, to.

The tabulated data, shown in Table 2, contain some surprising attributes. 
Approximately 90% of the titles contain less than five significant words, 
which suggests that four significant words will be adequate to match on 
title. 

Table 2. Significant Word Strings in Titles

                          Length of Word String
                            1       2       3       4      5+    Total
Number of titles           42     151     179      76      52      500
Percentage                8.4    30.2    35.8    15.2    10.4    100.0
Cumulative Percentage     8.4    38.6    74.4    89.6   100.0
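
The selection of significant title words can be sketched as follows. This is a minimal illustration rather than the program used in the study: the function name `significant_words`, the lower-casing, and the removal of punctuation are assumptions; only the list of non-significant words and the limit of four words come from the text.

```python
# Non-significant words exactly as listed in the text.
NON_SIGNIFICANT = {"a", "an", "and", "by", "if", "in", "of", "on", "the", "to"}


def significant_words(title, limit=4):
    """Return up to `limit` significant words of a title, in title order."""
    words = []
    for raw in title.lower().split():
        # Assumption: punctuation is simply dropped before the comparison.
        word = "".join(ch for ch in raw if ch.isalpha())
        if word and word not in NON_SIGNIFICANT:
            words.append(word)
    return words[:limit]


print(significant_words("Ancient Hunters of the Far West"))
# ['ancient', 'hunters', 'far', 'west']
```

Note that a word such as "as" is retained, since it does not appear in the list of non-significant words.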

Letting n stand for the corpus of words available for title use, the
random chance of duplicating any specific word in another title can be
stated as $\frac{1}{n}$. When a string of words is considered, the chance
of randomly selecting the same word string may be considered as
$\frac{1}{n^{a}}$, where 'a' is the number of words in the string.



Certain words are used more frequently than others, and the occurrence 
of such words in a given string reduces the uniqueness of that string. The 
curve displayed in Figure 1 shows the frequency distribution of words in 
the sample. The mean frequency of words in the title-sample is 1.33. 

[Figure 1: curve of number of words (vertical axis, scaled 0 to 900)
plotted against frequency (horizontal axis, 1 to 12).]

Fig. 1. Frequency Distribution of Words in Sample.



Therefore, the chance of selecting an identical word string can be more
accurately expressed as:

$$\frac{1.33^{a}}{n^{a}}$$

An examination of word lengths, as shown in Table 3, shows that 95%
of the significant title words contain fewer than ten characters. An
examination of the word list revealed that some 70% of the title words
contain inflections and/or suffixes. If these suffixes and inflections are
removed, approximately 43% of the remaining word stems contain fewer than
five characters and 59% contain fewer than six.

Table 3. Distribution of Character Length and Stem Length

Length in     Total    Different
Characters    Words    Words        Percent    Stems    Percent

     1           7         5          0.5         5       0.8
     2          25        14          1.3        14       2.3
     3          87        48          4.6        48       7.9
     4         172       117         11.1       196      32.3
     5         229       163         15.5        92      15.2
     6         198       153         14.5        94      15.5
     7         202       159         15.3        64      10.6
     8         158       122         11.6        45       7.4
     9         121       102          9.7        15       2.5
    10          84        69          6.6         8       1.3
    11          54        48          4.6         7       1.2
    12          38        28          2.7         2       0.3
    13          14        12          1.1         2       0.3
    14           6         4          0.4         0       0.0
    15           3         3          0.3         0       0.0
    16           2         2          0.2         0       0.0

Summary       1400      1049                    592

The reduction of word length does affect the uniqueness of the individual
word, merging distinct words into common word stems at a mean rate
of 2.5 to 1.0. In Table 3 the difference between 1049 words and 592 stems
reflects the reduction of similar words into a common stem; for example:
America, American, Americans, Americanism, etc., into Amer. Thus, the
uniqueness of a string of title words is reduced to the following chance
of duplication:

$$\frac{(2.5 \times 1.33)^{a}}{n^{a}} \quad \text{or} \quad \frac{3.3^{a}}{n^{a}}$$



An analysis of consonant strings made by Dolby and Resnikoff provides
frequencies of initial and terminal consonant strings occurring in 7000
common English words (with suffixes and inflections removed) (11, 12,
13). These frequency lists clearly show that the terminal string of
consonants has considerable information-carrying potential in terms of
word identification. The starting string also carries information
potential, but significantly less than the terminal string. By combining
the initial and terminal strings, it is possible to generate an
abbreviation which has adequate uniqueness and reduces the influence of
spelling.

The high percentage of four-character word stems and the fact that the
maximum terminal string contains four consonants suggest the use of a
four-character abbreviation. To compress a title word into four characters,
it is necessary to specify a set of rules. The first rule is to delete
all suffixes and inflections which terminate a title word; the suffixes
and inflections deleted in this procedure are contained in Table 4. The
second rule is to delete vowels from the stem until a consonant is located
or a four-character stem is produced. When the stem still contains more
than four characters, the third rule fills the four-character field with
the terminal-consonant string and fills the remaining positions from the
initial-character string.

Table 4. Deleted Suffixes and Inflections

-ic       -ive      -in       -et
-ed       -ative    -ain      -est
-aged     -ize      -on       -ant
-oid      -ing      -ion      -ent
-ance     -og       -ation    -ient
-ence     -log      -ship     -ment
-ide      -olog     -er       -ist
-age      -ish      -or       -y
-able     -al       -s        -ency
-ible     -ial      -es       -ogy
-ite      -ful      -ies      -ology
-ine      -ism      -ives     -ly
-ure      -um       -ess      -ry
-ise      -ium      -us       -ary
-ose      -an       -ous      -ory
-ate      -ian      -ious     -ity
-ite
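
The three compression rules can be sketched in a few lines. The following is an illustrative reading of the rules, not the routine run in the study. Several points are assumptions: suffixes are removed repeatedly, longest first, and only while at least four letters remain; the second rule is read as deleting trailing vowels; and "y" is treated as a consonant. The names `strip_suffixes` and `compress` are hypothetical. Under these assumptions the sketch yields BULD, LIBR and COCT for Building Library Collections, but it is not guaranteed to agree with the original program on every word.

```python
VOWELS = set("aeiou")  # assumption: "y" counts as a consonant

# Suffixes and inflections of Table 4, hyphens dropped.
SUFFIXES = """
ic ed aged oid ance ence ide age able ible ite ine ure ise ose ate
ive ative ize ing og log olog ish al ial ful ism um ium an ian
in ain on ion ation ship er or s es ies ives ess us ous ious
et est ant ent ient ment ist y ency ogy ology ly ry ary ory ity
""".split()


def strip_suffixes(word, min_stem=4):
    """Rule 1: remove terminal suffixes, longest first (order is an assumption).

    A suffix is removed only if at least `min_stem` letters remain.
    """
    while True:
        candidates = [s for s in SUFFIXES
                      if word.endswith(s) and len(word) - len(s) >= min_stem]
        if not candidates:
            return word
        word = word[:-len(max(candidates, key=len))]


def compress(word, personal_name=False):
    """Compress one word into an abbreviation of at most four characters."""
    stem = word.lower()
    if not personal_name:  # personal names skip suffix deletion
        stem = strip_suffixes(stem)
    # Rule 2: drop trailing vowels until a consonant or a four-letter stem.
    while len(stem) > 4 and stem[-1] in VOWELS:
        stem = stem[:-1]
    if len(stem) <= 4:
        return stem.upper()
    # Rule 3: terminal consonant string (at most four letters) on the right,
    # remaining positions filled from the start of the stem.
    tail = ""
    for ch in reversed(stem):
        if ch in VOWELS or len(tail) == 4:
            break
        tail = ch + tail
    return (stem[:4 - len(tail)] + tail).upper()


print([compress(w) for w in "Building Library Collections".split()])
# ['BULD', 'LIBR', 'COCT']
```

Calling `compress("Heinrichs", personal_name=True)` skips the suffix rule and gives HCHS, which illustrates how a personal name can be handled by the same routine.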

The relative uniqueness of the generated abbreviation can be calculated
using the data supplied by Dolby and Resnikoff. For example, Carter
and Bonk's Building Library Collections would be abbreviated BULD,
LIBR, COCT. The random chance of duplicating any abbreviation can be
stated as the product of the random chance of duplicating the initial
string and the random chance of duplicating the terminal string:



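$$\frac{f_i}{n_i} \times \frac{f_t}{n_t} \times 3.3^{2}$$

The frequencies listed by Dolby and Resnikoff may be substituted in
the above equation, producing the following chances for duplication:

$$\frac{324}{6800} \times \frac{63}{6800} \times 10.89 = \frac{1}{208} \quad \text{for BULD}$$

$$\frac{288}{6800} \times \frac{1}{6800} \times 10.89 = \frac{1}{14745} \quad \text{for LIBR}$$

$$\frac{277}{6800} \times \frac{16}{6800} \times 10.89 = \frac{1}{1041} \quad \text{for COCT}$$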

The random chance of duplicating this string of three abbreviations can
be calculated by multiplying the individual calculations, which yields the
random chance of 1 in $32 \times 10^{8}$. This high uniqueness declines
rapidly when the title contains fewer than three significant words and
contains high-frequency words, such as the title Collected Works, for which
the same uniqueness calculation produces the random chance of 1 in
$44 \times 10^{4}$.
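
The arithmetic of the uniqueness calculation can be checked with a short sketch. The function name `duplication_chance` and its parameter names are hypothetical; the corpus size of 6800, the factor 10.89 (3.3 squared), the BULD frequencies 324 and 63, and the three per-word odds are the figures quoted in the text.

```python
import math


def duplication_chance(f_initial, f_terminal, corpus=6800, factor=10.89):
    """Chance of duplicating one abbreviation: initial-string frequency
    times terminal-string frequency, scaled by the factor 3.3 squared."""
    return (f_initial / corpus) * (f_terminal / corpus) * factor


buld = duplication_chance(324, 63)
print(round(1 / buld))  # 208

# Odds against duplicating the whole three-word string, using the
# per-word odds quoted in the text (1 in 208, 1 in 14745, 1 in 1041).
string_odds = math.prod([208, 14745, 1041])
print(string_odds)  # 3192705360, i.e. about 32 x 10^8
```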

To increase the level of uniqueness on short titles, like Collected Works, 
it becomes necessary to provide supporting data to the title information. 
It is clear that the supporting data must come from supplied author text. 

AUTHOR COMPRESSION 

The same compression algorithms can be used for both personal and
corporate names with some modifications. The frequent substitution of
"conference" for "congress" and "symposia" for "symposium" suggests that
meeting names should be considered as a secondary sub-set of
non-significant words. Names of organizational divisions, such as bureau,
department, ministry, and office, can be considered as part of the same
sub-set.

The rules which govern the deletion of inflections, suffixes and vowels
can be used for corporate names, but personal author names must be
carried into the compression routine without modification. Only the last
name of an author would be compressed into a code.

CONSTRUCTING THE TEST 

Four, four-character abbreviations are allowed for title compression and
four for author. Rather than use a 32-character fixed field for these codes,
the lengths of the input and main-base codes are variable, with leading
control digits to specify the individual code sizes for the title and author
segments.

Provision is made for the inclusion of date, publisher and/or edition in
the search-code structure, although these were not implemented in the
test performed.



At the time the input data is read, the existence of title, author, edition,
publisher and date is indicated by the setting of indicators which control
the matching mask and which, in part, control the specification of the
retrieve threshold. The title indicator specifies the number of compressed
words in the supplied title which must be matched by the base code.

A simple algorithm is used to calculate the threshold values given in
columns two through four of Table 5. Columns five through seven are
obtained by adding two to the calculated thresholds. Each agreement
within the mask adds to a retrieve counter the values indicated in the
last five columns of Table 5, the values of X and Y being the number of
matching code words in the title and author segments respectively.
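
The retrieve counter and its comparison with the threshold can be sketched as below. The agreement values (4 per matching title code, 2 per matching author code, 3 for edition, 2 for publisher, 1 for date) and the full-code thresholds for input that includes an author (12 for three or four title codes, 8+2Y for two, 4+2Y for one) are taken from Table 5. The function names, and the rule that an item is retrieved when the counter equals or exceeds the threshold, are assumptions of this sketch.

```python
def retrieve_value(title_matches, author_matches,
                   edition=False, publisher=False, date=False):
    """Sum the agreement values of Table 5 for one candidate entry."""
    value = 4 * title_matches + 2 * author_matches
    value += 3 if edition else 0
    value += 2 if publisher else 0
    value += 1 if date else 0
    return value


def full_code_threshold(title_codes, author_codes):
    """Full-code threshold when an author is supplied (XY rows of Table 5)."""
    if title_codes >= 3:
        return 12
    return 4 * title_codes + 2 * author_codes  # 8+2Y or 4+2Y


def is_retrieved(value, threshold):
    """Assumption: retrieval occurs when the counter reaches the threshold."""
    return value >= threshold


# Three title codes and one author code supplied, all of them matching.
print(retrieve_value(3, 1), full_code_threshold(3, 1))  # 14 12
```

With three matching title codes and no author agreement the counter is 12, which still reaches the threshold of 12 under the assumed comparison.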

CONDUCTING THE TEST 
As mentioned above, the initial tests of the retrieve were based upon
title and author matching exclusively and required three runs on the
Fondren Library's 1401 computer. The first loaded 2874 original
order-requests, generated a search code utilizing the rules specified in
this paper, and created an input tape. The second run extracted title and
author data from the MARC I data base and created multiple search codes
for title, main entry, added title and added entry. Both tapes were sorted
into ascending search-code sequence.

The final run was the search program which attempted to match input
codes with the MARC I base codes. When there was agreement based
on the relationship of threshold and retrieve counter, the printer displayed
threshold, short author and short title on one line, and retrieve value,
input author and title on the next line, as illustrated in Figure 2. The
printed results were compared to validate the accuracy of the retrieve.
This comparison was cross-checked against the results of the acquisition
department's manual procedures.

The search program also provided for an attempt to match titles on the
basis of a rearrangement of title words. In such attempts the retrieve
threshold was raised.

ANALYSIS OF RESULTS 
The raw data obtained from this experimental run are shown in Table
6. Of the 2874 items represented in the input file, 48.4%, or 1392, were
actually found to exist in the data base. Of those actually present 90.4%,
or 1200, were extracted with an overall accuracy of 98.67%.

An examination of the sixteen false drops revealed several omissions 
in the compression routines for the input data and for the data base. One 
of the more significant omissions was failing to compensate for multi-char-
acter abbreviations, particularly 'ST.' and 'STE.' for 'Saint.' A subroutine 
for acceptance of such abbreviations added to the search-code generating 
program would increase the retrieve accuracy to 99%. 


Table 5. Values for Variable Threshold

Data      Threshold Values                                    Agreement Values
Given     Full-Code Test            Individual Code Test
TAEPD     3 or 4  2       1         3 or 4  2       1         Title  Author  Edition  Publish.  Date

XY111     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY110     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY101     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY100     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY011     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY010     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY001     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
XY000     12      8+2Y    4+2Y      14      10+2Y   6+2Y      4X     2Y      3        2         1
X0111     12      11      7         13      12      7         4X     2Y      3        2         1
X0110     12      11      7         13      12      7         4X     2Y      3        2         1
X0101     12      11      7         13      12      7         4X     2Y      3        2         1
X0100     12      11      7         13      11      7         4X     2Y      3        2         1
X0011     12      10      6         13      11      7         4X     2Y      3        2         1
X0010     12      10      6         13      Not permitted     4X     2Y      3        2         1
X0001     12      9       5         13      Not permitted     4X     2Y      3        2         1
X0000     12      Not permitted     Not permitted


10  AMERAMBRHCHS               HEINRICHS, WALDO H.       AMERICAN AMBASSADOR JOSEPH C. GR
10  AMERAMBRHCHS               HEINRICHS*                AMERICAN AMBASSADOR*

06  AMERBOLL                   BOSWELL, CHARLES.         THE AMERICA THE STORY OF THE WORL
06  AMERBOLL                   BOSWELL*                  THE AMERICA. THE STORY OF THE WORLD

12  AMERBUSQSHOWZIEN           ZIEDMAN, IRVING.          THE AMERICAN BURLESQUE SHOW.
12  AMERBUSQSHOWZEIN           ZEIDMAN*                  THE AMERICAN BURLESQUE SHOW*

12  AMERCNTRCAMPBRTH           BOSWORTH, ALLAN R.        AMERICA-S CONCENTRATION CAMPS BY
12  AMERCNTRCAMP               CLAY, C. T.*              AMERICA-S CONCENTRATION CAMPS*

12  AMERJEWSISRLISCS           ISAACS, HAROLD ROBERT,    AMERICAN JEWS IN ISRAEL BY HARO
14  AMERJEWSISRLISCSHALOR      ISAACS, HAROLD R.*        AMERICAN JEWS IN ISRAEL*

12  AMEROCCPSTCTBLAU           BLAU, PETER MICHAEL.      THE AMERICAN OCCUPATIONAL STRUCTUR
14  AMEROCCPSTCTBLAU           BLAU*                     THE AMERICAN OCCUPATIONAL STRUCTURE

12  AMEROCCPSTCTDUNN           DUNCAN, OTIS DUDLEY, JO   THE AMERICAN OCCUPATIONAL STRUCTUR
12  AMEROCCPSTCTBLAU           BLAU*                     THE AMERICAN OCCUPATIONAL STRUCTURE

12  AMERPARTSYSMCHRS           CHAMBERS, WILLIAM NISBET  THE AMERICAN PARTY SYSTEMS STAGES
14  AMERPARTSYSMCHRS           CHAMBERS*                 THE AMERICAN PARTY SYSTEMS. STAGES

10  AMERPREDWARN               WARREN, SIDNEY, 1916-     THE AMERICAN PRESIDENT
10  AMERPREDWARN               WARREN*                   THE AMERICAN PRESIDENT*

10  AMERSCHKBLCK               BLACK, HILLEL.            THE AMERICAN SCHOOLBOOK.
10  AMERSCHKBLCK               BLACK*                    THE AMERICAN SCHOOLBOOK*

10  AMERSCHOSEXN               SEXTON, PATRICIA CAYO.    THE AMERICAN SCHOOL A SOCIOLOGIC
10  AMERSCHOSEXNPATCCAYO       SEXTON, PATRICIA CAYO*    THE AMERICAN SCHOOL. A SOCIOLOGICAL

12  AMERSPACEXPRSHEN           SHELTON, WILLIAM ROY,     AMERICAN SPACE EXPLORATION THE F
14  AMERSPACEXPRSHEN           SHELTON*                  AMERICAN SPACE EXPLORATION. THE FIR

12  AMERTHETTODADOWR           DOWNER, ALAN SEYMOUR,     THE AMERICAN THEATER TODAY, EDITE
14  AMERTHETTODADOWR           DOWNER*                   THE AMERICAN THEATER. TODAY*

12  AMERTHTRAS  SEENBRWN       BROWN, JOHN MASON, 1900   THE AMERICAN THEATRE AS SEEN BY IT
16  AMERTHTRAS  SEENMOSSMONSJ  MOSES, MONTROSE J.*       THE AMERICAN THEATRE AS SEEN BY ITS

12  AMERTHTRAS  SEENMOSS       MOSES, MONTROSE JONAS,    THE AMERICAN THEATRE AS SEEN BY IT
18  AMERTHTRAS  SEENMOSSMONSJ  MOSES, MONTROSE J.*       THE AMERICAN THEATRE AS SEEN BY ITS

12  ANAZPHPHARGUMCGL           MCGREAL, IAN PHILIP, 19   ANALYZING PHILOSOPHICAL ARGUMENTS
12  ANAZPHPHARGUMCGFJAN PHIP   MCGREAF, JAN PHILLIP*     ANALYZING PHILOSOPHICAL ARGUMENTS.

12  ANCIHUNTFAR WESTPOUD       POURADE, RICHARD F.       ANCIENT HUNTERS OF THE FAR WEST,
18  ANCIHUNTFAR WESTPOUD       POURADE*                  ANCIENT HUNTERS OF THE FAR WEST*

Fig. 2. Sample of Retrieved Citations.




Table 6. Table of Results

Retrieve    Total    Correct    False    Percentage
Values      Hits     Hits       Hits     Correct

    6         14        14         0       100
    8          0         0         0         0
   10        311       311         0       100
   12        264       248        16        93.3
   14        232       232         0       100
   16        118       118         0       100
   18        260       260         0       100
   20          1         1         0       100

Totals      1200      1184        16        98.7

Table 7. Distribution of Errors

            Title Errors         Author Errors
No. of      Title                Author     Author
Codes       Error    Spelling    Lacking    Error    Spelling    Other    Total

  1           2         3          10        12        27          4        58
  2           2         6          17        26        60         23       134
  3           0         0           0         0         0          0         0
  4           0         0           0         0         0          0         0

Total         4         9          27        38        87         27       192

The occurrence of titles with the words "selected" or "collected," etc.,
produced additional false drops when the title word string exceeded two
words. A modification to the search program to raise the threshold when
the input data contain codes such as 'SECT' or 'COCT' would increase the
retrieve accuracy to 99.17%.

The presence of personal names in titles, such as 'Charles Evans Hughes' 
and 'Franklin Delano Roosevelt' caused seven additional false drops. At 
present it seems unlikely that a simple method to prevent them can be 
included. 

CONCLUSION 

The experimental results indicate that the hypothesis suggested is valid.
Use of multiple codes, for added entry and added title in addition to
main entry and main title, is clearly necessary. Approximately 10% of
the correctly retrieved items were produced by the existence of an added
entry code.

The influence of spelling accuracy was lessened by use of a compres-
sion technique. An inspection of extracted titles revealed the existence of 
43 spelling errors which did not affect retrieval. Thus, the search code 
reduced the significance of spelling by some 30%. 

Table search, followed by table look-up and linking of random-access
addresses, should enable the search code approach to bibliographic
retrieval to provide rapid, direct access to the title sought.

ACKNOWLEDGMENT 

This study was supported in part by National Science Foundation grants 
GN-758 and GU-1153 and by the Regional Information and Communica-
tion Exchange. The assistance of the Acquisitions Department staff, the 
Research Computation Center staff and the staff of the Fondren Library's 
Data Processing Division is gratefully acknowledged. 

REFERENCES 

1. Morris, Ned C.: "Computer Based Acquisitions System at Texas
A & I University," Journal of Library Automation, 1 (March 1968),
1-12.

2. Wedgeworth, Robert: "Brown University Library Fund Accounting
System," Journal of Library Automation, 1 (March 1968), 51-65.

3. U. S. Library of Congress: Project MARC, an Experiment in Auto-
mating Library of Congress Catalog Data (Washington: 1967). 

4. Richmond, Phyllis A.: "Note on Updating and Searching Computer-
ized Catalogs," Library Resources and Technical Services, 10 (Spring 
1966), 155-160. 

5. Richmond, Phyllis A.: "Source Retrieval," Physics Today, 18 (April
1965), 46-48.

6. Atherton, P.; Yovich, J. C.: Three Experiments with Citation Indexing
and Bibliographic Coupling of Physics Literature (New York: American
Institute of Physics, 1962).

7. International Business Machines Corporation: Reference Manual, Index 
Organization for Information Retrieval (IBM, 1961). 

8. International Business Machines Corporation: A Unique Computable 
Name Code for Alphabetic Account Numbering (White Plains, N.Y.: 
IBM, 1960). 

9. Tainiter, M.: "Addressing Random-Access Storage with Multiple
Bucket Capacities," Association for Computing Machinery Journal,
10 (July 1963), 307-315.

10. Toyoda, Junichi; Tazuka, Yoshikazu; Kasahara, Yoshiro: "Analysis of 
the Address Assignment Problems for Clustered Keys," Association 
for Computing Machinery Journal, 13 (October 1966), 526-532. 

11. Dolby, James L.; Resnikoff, Howard L.: "On the Structure of Written 
English Words," Language, 40 (Apr-June 1964), 167-196. 

12. Resnikoff, Howard L.; Dolby, James L.: "The Nature of Affixing in 
Written English, Part I," Mechanical Translation, 8 (March 1965), 
84-89. 

13. Resnikoff, Howard L.; Dolby, James L.: "The Nature of Affixing in 
Written English, Part II," Mechanical Translation, 9 (June 1966), 
23-33.