id sid tid token lemma pos 8961 1 1 The the DT 8961 1 2 Binary Binary NNP 8961 1 3 Vector Vector NNP 8961 1 4 as as IN 8961 1 5 the the DT 8961 1 6 Basis basis NN 8961 1 7 of of IN 8961 1 8 an an DT 8961 1 9 Inverted Inverted NNP 8961 1 10 Index Index NNP 8961 1 11 File File NNP 8961 1 12 Donald Donald NNP 8961 1 13 R. R. NNP 8961 1 14 KING KING NNP 8961 1 15 : : : 8961 1 16 Rutgers Rutgers NNP 8961 1 17 University University NNP 8961 1 18 , , , 8961 1 19 New New NNP 8961 1 20 Brunswick Brunswick NNP 8961 1 21 , , , 8961 1 22 New New NNP 8961 1 23 Jersey Jersey NNP 8961 1 24 . . . 8961 2 1 307 307 CD 8961 2 2 The the DT 8961 2 3 inverted inverted JJ 8961 2 4 index index NN 8961 2 5 file file NN 8961 2 6 is be VBZ 8961 2 7 a a DT 8961 2 8 frequently frequently RB 8961 2 9 used use VBN 8961 2 10 file file NN 8961 2 11 structure structure NN 8961 2 12 for for IN 8961 2 13 the the DT 8961 2 14 storage storage NN 8961 2 15 of of IN 8961 2 16 indexing indexing NN 8961 2 17 information information NN 8961 2 18 in in IN 8961 2 19 a a DT 8961 2 20 document document JJ 8961 2 21 retrieval retrieval NN 8961 2 22 system system NN 8961 2 23 . . . 8961 3 1 This this DT 8961 3 2 paper paper NN 8961 3 3 de- de- CC 8961 3 4 scribes scribe VBZ 8961 3 5 a a DT 8961 3 6 novel novel JJ 8961 3 7 method method NN 8961 3 8 for for IN 8961 3 9 the the DT 8961 3 10 computer computer NN 8961 3 11 storage storage NN 8961 3 12 of of IN 8961 3 13 such such PDT 8961 3 14 an an DT 8961 3 15 index index NN 8961 3 16 . . . 
8961 4 1 The the DT 8961 4 2 method method NN 8961 4 3 not not RB 8961 4 4 only only RB 8961 4 5 offers offer VBZ 8961 4 6 the the DT 8961 4 7 possibility possibility NN 8961 4 8 of of IN 8961 4 9 reducing reduce VBG 8961 4 10 storage storage NN 8961 4 11 requirements requirement NNS 8961 4 12 for for IN 8961 4 13 an an DT 8961 4 14 index index NN 8961 4 15 but but CC 8961 4 16 also also RB 8961 4 17 affords afford VBZ 8961 4 18 more more JJR 8961 4 19 rapid rapid JJ 8961 4 20 processing processing NN 8961 4 21 of of IN 8961 4 22 query query NN 8961 4 23 statements statement NNS 8961 4 24 ex- ex- RB 8961 4 25 pressed press VBN 8961 4 26 in in IN 8961 4 27 Boolean boolean JJ 8961 4 28 logic logic NN 8961 4 29 . . . 8961 5 1 INTRODUCTION INTRODUCTION NNP 8961 5 2 The the DT 8961 5 3 inverted invert VBN 8961 5 4 index index NN 8961 5 5 file file NN 8961 5 6 is be VBZ 8961 5 7 a a DT 8961 5 8 frequently frequently RB 8961 5 9 used use VBN 8961 5 10 file file NN 8961 5 11 structure structure NN 8961 5 12 for for IN 8961 5 13 the the DT 8961 5 14 storage storage NN 8961 5 15 of of IN 8961 5 16 indexing indexing NN 8961 5 17 information information NN 8961 5 18 in in IN 8961 5 19 document document JJ 8961 5 20 retrieval retrieval NN 8961 5 21 systems system NNS 8961 5 22 . . . 8961 6 1 An an DT 8961 6 2 inverted invert VBN 8961 6 3 index index NN 8961 6 4 file file NN 8961 6 5 may may MD 8961 6 6 be be VB 8961 6 7 used use VBN 8961 6 8 by by IN 8961 6 9 itself -PRON- PRP 8961 6 10 or or CC 8961 6 11 with with IN 8961 6 12 a a DT 8961 6 13 direct direct JJ 8961 6 14 file file NN 8961 6 15 in in IN 8961 6 16 a a DT 8961 6 17 so so RB 8961 6 18 - - HYPH 8961 6 19 called call VBN 8961 6 20 combined combine VBN 8961 6 21 file file NN 8961 6 22 system system NN 8961 6 23 . . . 
8961 7 1 The the DT 8961 7 2 inverted inverted JJ 8961 7 3 index index NN 8961 7 4 file file NN 8961 7 5 contains contain VBZ 8961 7 6 a a DT 8961 7 7 logical logical JJ 8961 7 8 record record NN 8961 7 9 for for IN 8961 7 10 each each DT 8961 7 11 of of IN 8961 7 12 the the DT 8961 7 13 subject subject JJ 8961 7 14 headings heading NNS 8961 7 15 or or CC 8961 7 16 index index NN 8961 7 17 terms term NNS 8961 7 18 which which WDT 8961 7 19 may may MD 8961 7 20 be be VB 8961 7 21 used use VBN 8961 7 22 to to TO 8961 7 23 describe describe VB 8961 7 24 documents document NNS 8961 7 25 in in IN 8961 7 26 the the DT 8961 7 27 system system NN 8961 7 28 . . . 8961 8 1 Within within IN 8961 8 2 each each DT 8961 8 3 logical logical JJ 8961 8 4 record record NN 8961 8 5 there there EX 8961 8 6 is be VBZ 8961 8 7 a a DT 8961 8 8 list list NN 8961 8 9 of of IN 8961 8 10 pointers pointer NNS 8961 8 11 to to IN 8961 8 12 those those DT 8961 8 13 documents document NNS 8961 8 14 which which WDT 8961 8 15 have have VBP 8961 8 16 been be VBN 8961 8 17 indexed index VBN 8961 8 18 by by IN 8961 8 19 the the DT 8961 8 20 subject subject NN 8961 8 21 heading heading NN 8961 8 22 in in IN 8961 8 23 question question NN 8961 8 24 . . . 8961 9 1 The the DT 8961 9 2 individual individual JJ 8961 9 3 pointers pointer NNS 8961 9 4 are be VBP 8961 9 5 usually usually RB 8961 9 6 in in IN 8961 9 7 the the DT 8961 9 8 form form NN 8961 9 9 of of IN 8961 9 10 document document NN 8961 9 11 numbers number NNS 8961 9 12 stored store VBN 8961 9 13 in in IN 8961 9 14 fixed fix VBN 8961 9 15 - - HYPH 8961 9 16 length length NN 8961 9 17 digital digital JJ 8961 9 18 form form NN 8961 9 19 . . . 
8961 10 1 Obviously obviously RB 8961 10 2 , , , 8961 10 3 the the DT 8961 10 4 length length NN 8961 10 5 of of IN 8961 10 6 the the DT 8961 10 7 lists list NNS 8961 10 8 will will MD 8961 10 9 vary vary VB 8961 10 10 from from IN 8961 10 11 record record NN 8961 10 12 to to IN 8961 10 13 record record NN 8961 10 14 . . . 8961 11 1 The the DT 8961 11 2 purpose purpose NN 8961 11 3 of of IN 8961 11 4 this this DT 8961 11 5 paper paper NN 8961 11 6 is be VBZ 8961 11 7 the the DT 8961 11 8 presentation presentation NN 8961 11 9 of of IN 8961 11 10 a a DT 8961 11 11 new new JJ 8961 11 12 technique technique NN 8961 11 13 for for IN 8961 11 14 the the DT 8961 11 15 storage storage NN 8961 11 16 of of IN 8961 11 17 the the DT 8961 11 18 lists list NNS 8961 11 19 of of IN 8961 11 20 pointers pointer NNS 8961 11 21 to to IN 8961 11 22 documents document NNS 8961 11 23 . . . 8961 12 1 It -PRON- PRP 8961 12 2 will will MD 8961 12 3 be be VB 8961 12 4 shown show VBN 8961 12 5 that that IN 8961 12 6 this this DT 8961 12 7 technique technique NN 8961 12 8 not not RB 8961 12 9 only only RB 8961 12 10 reduces reduce VBZ 8961 12 11 storage storage NN 8961 12 12 requirements requirement NNS 8961 12 13 , , , 8961 12 14 but but CC 8961 12 15 that that IN 8961 12 16 in in IN 8961 12 17 many many JJ 8961 12 18 cases case NNS 8961 12 19 the the DT 8961 12 20 time time NN 8961 12 21 required require VBN 8961 12 22 to to TO 8961 12 23 search search VB 8961 12 24 the the DT 8961 12 25 index index NN 8961 12 26 is be VBZ 8961 12 27 reduced reduce VBN 8961 12 28 . . . 8961 13 1 The the DT 8961 13 2 technique technique NN 8961 13 3 is be VBZ 8961 13 4 useful useful JJ 8961 13 5 in in IN 8961 13 6 systems system NNS 8961 13 7 which which WDT 8961 13 8 use use VBP 8961 13 9 Boolean boolean JJ 8961 13 10 searches search NNS 8961 13 11 . . . 
8961 14 1 The the DT 8961 14 2 relative relative JJ 8961 14 3 merits merit NNS 8961 14 4 of of IN 8961 14 5 Boolean boolean JJ 8961 14 6 and and CC 8961 14 7 weighted weighted JJ 8961 14 8 term term NN 8961 14 9 searches search NNS 8961 14 10 are be VBP 8961 14 11 beyond beyond IN 8961 14 12 the the DT 8961 14 13 scope scope NN 8961 14 14 of of IN 8961 14 15 this this DT 8961 14 16 paper paper NN 8961 14 17 , , , 8961 14 18 as as IN 8961 14 19 are be VBP 8961 14 20 the the DT 8961 14 21 rela- rela- JJ 8961 14 22 tive tive JJ 8961 14 23 merits merit NNS 8961 14 24 of of IN 8961 14 25 the the DT 8961 14 26 various various JJ 8961 14 27 possible possible JJ 8961 14 28 file file NN 8961 14 29 structures structure NNS 8961 14 30 . . . 8961 15 1 THE the DT 8961 15 2 BINARY BINARY NNP 8961 15 3 VECTOR VECTOR NNP 8961 15 4 AS as IN 8961 15 5 A a DT 8961 15 6 STORAGE storage NN 8961 15 7 DEVICE device NN 8961 15 8 The the DT 8961 15 9 exact exact JJ 8961 15 10 form form NN 8961 15 11 of of IN 8961 15 12 each each DT 8961 15 13 document document NN 8961 15 14 pointer pointer NN 8961 15 15 is be VBZ 8961 15 16 immaterial immaterial JJ 8961 15 17 to to IN 8961 15 18 the the DT 8961 15 19 user user NN 8961 15 20 of of IN 8961 15 21 a a DT 8961 15 22 document document JJ 8961 15 23 retrieval retrieval NN 8961 15 24 system system NN 8961 15 25 as as RB 8961 15 26 long long RB 8961 15 27 as as IN 8961 15 28 he -PRON- PRP 8961 15 29 is be VBZ 8961 15 30 able able JJ 8961 15 31 to to TO 8961 15 32 obtain obtain VB 8961 15 33 the the DT 8961 15 34 document document NN 8961 15 35 he -PRON- PRP 8961 15 36 desires desire VBZ 8961 15 37 . . . 
8961 16 1 The the DT 8961 16 2 standard standard JJ 8961 16 3 form form NN 8961 16 4 for for IN 8961 16 5 these these DT 8961 16 6 pointers pointer NNS 8961 16 7 in in IN 8961 16 8 most most RBS 8961 16 9 automated automate VBN 8961 16 10 systems system NNS 8961 16 11 is be VBZ 8961 16 12 a a DT 8961 16 13 document document NN 8961 16 14 number number NN 8961 16 15 . . . 8961 17 1 Note note VB 8961 17 2 that that IN 8961 17 3 each each DT 8961 17 4 pointer pointer NN 8961 17 5 is be VBZ 8961 17 6 by by IN 8961 17 7 itself -PRON- PRP 8961 17 8 a a DT 8961 17 9 piece piece NN 8961 17 10 of of IN 8961 17 11 infor- infor- JJ 8961 17 12 mation mation NN 8961 17 13 . . . 8961 18 1 However however RB 8961 18 2 , , , 8961 18 3 if if IN 8961 18 4 one one PRP 8961 18 5 thinks think VBZ 8961 18 6 of of IN 8961 18 7 a a DT 8961 18 8 " " `` 8961 18 9 peek peek JJ 8961 18 10 - - HYPH 8961 18 11 a a DT 8961 18 12 - - HYPH 8961 18 13 boo boo NN 8961 18 14 " " '' 8961 18 15 system system NN 8961 18 16 , , , 8961 18 17 the the DT 8961 18 18 document document NN 8961 18 19 308 308 CD 8961 18 20 Journal Journal NNP 8961 18 21 of of IN 8961 18 22 Library Library NNP 8961 18 23 Automation Automation NNP 8961 18 24 Vol Vol NNP 8961 18 25 . . . 8961 19 1 7/4 7/4 CD 8961 19 2 December December NNP 8961 19 3 1974 1974 CD 8961 19 4 pointer pointer NN 8961 19 5 becomes become VBZ 8961 19 6 simply simply RB 8961 19 7 a a DT 8961 19 8 hole hole NN 8961 19 9 punched punch VBN 8961 19 10 in in IN 8961 19 11 a a DT 8961 19 12 card card NN 8961 19 13 . . . 8961 20 1 In in IN 8961 20 2 this this DT 8961 20 3 case case NN 8961 20 4 the the DT 8961 20 5 position position NN 8961 20 6 of of IN 8961 20 7 the the DT 8961 20 8 pointer pointer NN 8961 20 9 , , , 8961 20 10 not not RB 8961 20 11 the the DT 8961 20 12 pointer pointer NN 8961 20 13 itself -PRON- PRP 8961 20 14 , , , 8961 20 15 conveys convey VBZ 8961 20 16 the the DT 8961 20 17 information information NN 8961 20 18 . . . 
8961 21 1 The the DT 8961 21 2 new new JJ 8961 21 3 technique technique NN 8961 21 4 presented present VBN 8961 21 5 in in IN 8961 21 6 this this DT 8961 21 7 paper paper NN 8961 21 8 is be VBZ 8961 21 9 an an DT 8961 21 10 extension extension NN 8961 21 11 of of IN 8961 21 12 the the DT 8961 21 13 " " `` 8961 21 14 peek- peek- JJ 8961 21 15 a a DT 8961 21 16 - - HYPH 8961 21 17 boo boo NN 8961 21 18 " " '' 8961 21 19 concept concept NN 8961 21 20 . . . 8961 22 1 A a DT 8961 22 2 vector vector NN 8961 22 3 or or CC 8961 22 4 string string NN 8961 22 5 of of IN 8961 22 6 binary binary JJ 8961 22 7 zeroes zero NNS 8961 22 8 is be VBZ 8961 22 9 constructed construct VBN 8961 22 10 equal equal JJ 8961 22 11 in in IN 8961 22 12 length length NN 8961 22 13 to to IN 8961 22 14 the the DT 8961 22 15 number number NN 8961 22 16 of of IN 8961 22 17 documents document NNS 8961 22 18 expected expect VBN 8961 22 19 in in IN 8961 22 20 the the DT 8961 22 21 system system NN 8961 22 22 . . . 8961 23 1 The the DT 8961 23 2 position position NN 8961 23 3 of of IN 8961 23 4 each each DT 8961 23 5 vector vector NN 8961 23 6 element element NN 8961 23 7 corresponds correspond VBZ 8961 23 8 to to IN 8961 23 9 a a DT 8961 23 10 document document NN 8961 23 11 number number NN 8961 23 12 . . . 8961 24 1 That that RB 8961 24 2 is is RB 8961 24 3 , , , 8961 24 4 the the DT 8961 24 5 first first JJ 8961 24 6 position position NN 8961 24 7 in in IN 8961 24 8 a a DT 8961 24 9 vector vector NN 8961 24 10 corresponds correspond VBZ 8961 24 11 to to TO 8961 24 12 document document VB 8961 24 13 number number NN 8961 24 14 one one CD 8961 24 15 and and CC 8961 24 16 the the DT 8961 24 17 tenth tenth JJ 8961 24 18 vector vector NN 8961 24 19 posi- posi- NN 8961 24 20 tion tion NN 8961 24 21 corresponds correspond NNS 8961 24 22 to to TO 8961 24 23 document document VB 8961 24 24 number number NN 8961 24 25 ten ten CD 8961 24 26 . . . 
8961 25 1 A a DT 8961 25 2 vector vector NN 8961 25 3 is be VBZ 8961 25 4 constructed construct VBN 8961 25 5 for for IN 8961 25 6 each each DT 8961 25 7 subject subject JJ 8961 25 8 heading heading NN 8961 25 9 in in IN 8961 25 10 the the DT 8961 25 11 system system NN 8961 25 12 . . . 8961 26 1 As as IN 8961 26 2 a a DT 8961 26 3 document document NN 8961 26 4 enters enter VBZ 8961 26 5 the the DT 8961 26 6 system system NN 8961 26 7 , , , 8961 26 8 ones one NNS 8961 26 9 are be VBP 8961 26 10 in- in- RB 8961 26 11 serted serte VBN 8961 26 12 in in IN 8961 26 13 place place NN 8961 26 14 of of IN 8961 26 15 the the DT 8961 26 16 zeroes zero NNS 8961 26 17 in in IN 8961 26 18 the the DT 8961 26 19 positions position NNS 8961 26 20 corresponding correspond VBG 8961 26 21 to to IN 8961 26 22 the the DT 8961 26 23 new new JJ 8961 26 24 doc- doc- NN 8961 26 25 ument ument NN 8961 26 26 number number NN 8961 26 27 in in IN 8961 26 28 the the DT 8961 26 29 vectors vector NNS 8961 26 30 for for IN 8961 26 31 the the DT 8961 26 32 subject subject JJ 8961 26 33 headings heading NNS 8961 26 34 used use VBN 8961 26 35 to to TO 8961 26 36 describe describe VB 8961 26 37 the the DT 8961 26 38 document document NN 8961 26 39 . . . 
8961 27 1 As as IN 8961 27 2 an an DT 8961 27 3 example example NN 8961 27 4 , , , 8961 27 5 assume assume VB 8961 27 6 the the DT 8961 27 7 following follow VBG 8961 27 8 document document NN 8961 27 9 descriptions description NNS 8961 27 10 are be VBP 8961 27 11 presented present VBN 8961 27 12 to to IN 8961 27 13 a a DT 8961 27 14 system system NN 8961 27 15 using use VBG 8961 27 16 binary binary JJ 8961 27 17 vectors vector NNS 8961 27 18 : : : 8961 27 19 Document document NN 8961 27 20 Number number NN 8961 27 21 1 1 CD 8961 27 22 2 2 CD 8961 27 23 3 3 CD 8961 27 24 Subject Subject NNP 8961 27 25 Headings Headings NNPS 8961 27 26 A a NN 8961 27 27 , , , 8961 27 28 B b NN 8961 27 29 , , , 8961 27 30 D D NNP 8961 27 31 C C NNP 8961 27 32 , , , 8961 27 33 E E NNP 8961 27 34 A a NN 8961 27 35 , , , 8961 27 36 C c NN 8961 27 37 The the DT 8961 27 38 binary binary JJ 8961 27 39 vectors vector NNS 8961 27 40 for for IN 8961 27 41 terms term NNS 8961 27 42 A a NN 8961 27 43 , , , 8961 27 44 B b NN 8961 27 45 , , , 8961 27 46 C C NNP 8961 27 47 , , , 8961 27 48 D d NN 8961 27 49 , , , 8961 27 50 and and CC 8961 27 51 E e NN 8961 27 52 before before IN 8961 27 53 the the DT 8961 27 54 insertion insertion NN 8961 27 55 of of IN 8961 27 56 the the DT 8961 27 57 indexing indexing NN 8961 27 58 data datum NNS 8961 27 59 would would MD 8961 27 60 be be VB 8961 27 61 as as IN 8961 27 62 follows follow VBZ 8961 27 63 : : : 8961 27 64 Subject subject JJ 8961 27 65 Heading head VBG 8961 27 66 A a NN 8961 27 67 B b NN 8961 27 68 C c NN 8961 27 69 D d NN 8961 27 70 E e NN 8961 27 71 Vector vector NN 8961 27 72 000 000 CD 8961 27 73 ... ... NFP 8961 27 74 0 0 CD 8961 27 75 000 000 CD 8961 27 76 ... ... NFP 8961 27 77 0 0 CD 8961 27 78 000 000 CD 8961 27 79 ... ... NFP 8961 27 80 0 0 CD 8961 27 81 000 000 CD 8961 27 82 ... ... NFP 8961 27 83 0 0 NFP 8961 27 84 000 000 CD 8961 27 85 ... ... 
NFP 8961 27 86 · · NFP 8961 27 87 0 0 CD 8961 27 88 After after IN 8961 27 89 the the DT 8961 27 90 insertion insertion NN 8961 27 91 of of IN 8961 27 92 the the DT 8961 27 93 indexing indexing NN 8961 27 94 information information NN 8961 27 95 , , , 8961 27 96 the the DT 8961 27 97 same same JJ 8961 27 98 vectors vector NNS 8961 27 99 would would MD 8961 27 100 appear appear VB 8961 27 101 as as IN 8961 27 102 follows follow VBZ 8961 27 103 : : : 8961 27 104 Subject subject JJ 8961 27 105 Heading head VBG 8961 27 106 A a NN 8961 27 107 B b NN 8961 27 108 C c NN 8961 27 109 D d NN 8961 27 110 E e NN 8961 27 111 Vector vector NN 8961 27 112 101 101 CD 8961 27 113 ... ... SYM 8961 27 114 0 0 CD 8961 27 115 100 100 CD 8961 27 116 ... ... SYM 8961 27 117 0 0 CD 8961 27 118 011 011 CD 8961 27 119 ... ... NFP 8961 27 120 0 0 CD 8961 27 121 100 100 CD 8961 27 122 ... ... SYM 8961 27 123 0 0 CD 8961 27 124 010 010 CD 8961 27 125 ... ... NFP 8961 27 126 0 0 NFP 8961 27 127 The the DT 8961 27 128 binary binary JJ 8961 27 129 vector vector NN 8961 27 130 seems seem VBZ 8961 27 131 to to TO 8961 27 132 have have VB 8961 27 133 several several JJ 8961 27 134 advantages advantage NNS 8961 27 135 over over IN 8961 27 136 the the DT 8961 27 137 standard standard JJ 8961 27 138 form form NN 8961 27 139 of of IN 8961 27 140 storage storage NN 8961 27 141 of of IN 8961 27 142 document document NN 8961 27 143 numbers number NNS 8961 27 144 in in IN 8961 27 145 an an DT 8961 27 146 inverted invert VBN 8961 27 147 file file NN 8961 27 148 . . . 
8961 28 1 First first RB 8961 28 2 , , , 8961 28 3 the the DT 8961 28 4 rec- rec- NN 8961 28 5 ords ord NNS 8961 28 6 are be VBP 8961 28 7 of of IN 8961 28 8 fixed fix VBN 8961 28 9 length length NN 8961 28 10 since since IN 8961 28 11 the the DT 8961 28 12 vectors vector NNS 8961 28 13 are be VBP 8961 28 14 all all RB 8961 28 15 equal equal JJ 8961 28 16 in in IN 8961 28 17 length length NN 8961 28 18 to to IN 8961 28 19 the the DT 8961 28 20 ex- ex- XX 8961 28 21 pected pecte VBN 8961 28 22 number number NN 8961 28 23 of of IN 8961 28 24 documents document NNS 8961 28 25 in in IN 8961 28 26 the the DT 8961 28 27 system system NN 8961 28 28 . . . 8961 29 1 Space space NN 8961 29 2 may may MD 8961 29 3 be be VB 8961 29 4 left leave VBN 8961 29 5 at at IN 8961 29 6 the the DT 8961 29 7 end end NN 8961 29 8 of of IN 8961 29 9 each each DT 8961 29 10 vector vector NN 8961 29 11 for for IN 8961 29 12 the the DT 8961 29 13 addition addition NN 8961 29 14 of of IN 8961 29 15 new new JJ 8961 29 16 documents document NNS 8961 29 17 . . . 8961 30 1 Periodic periodic JJ 8961 30 2 copying copying NN 8961 30 3 of of IN 8961 30 4 the the DT 8961 30 5 file file NN 8961 30 6 may may MD 8961 30 7 be be VB 8961 30 8 used use VBN 8961 30 9 to to TO 8961 30 10 expand expand VB 8961 30 11 the the DT 8961 30 12 index index NN 8961 30 13 records record NNS 8961 30 14 with with IN 8961 30 15 additional additional JJ 8961 30 16 zeroes zero NNS 8961 30 17 added add VBN 8961 30 18 at at IN 8961 30 19 the the DT 8961 30 20 end end NN 8961 30 21 of of IN 8961 30 22 each each DT 8961 30 23 record record NN 8961 30 24 during during IN 8961 30 25 the the DT 8961 30 26 process process NN 8961 30 27 . . . 
8961 31 1 Consequently consequently RB 8961 31 2 , , , 8961 31 3 unless unless IN 8961 31 4 Binary Binary NNP 8961 31 5 Vector Vector NNP 8961 31 6 / / SYM 8961 31 7 KING KING NNP 8961 31 8 309 309 CD 8961 31 9 there there EX 8961 31 10 are be VBP 8961 31 11 limitations limitation NNS 8961 31 12 of of IN 8961 31 13 size size NN 8961 31 14 imposed impose VBN 8961 31 15 by by IN 8961 31 16 the the DT 8961 31 17 equipment equipment NN 8961 31 18 , , , 8961 31 19 only only RB 8961 31 20 one one CD 8961 31 21 access access NN 8961 31 22 to to IN 8961 31 23 the the DT 8961 31 24 storage storage NN 8961 31 25 device device NN 8961 31 26 will will MD 8961 31 27 be be VB 8961 31 28 needed need VBN 8961 31 29 to to TO 8961 31 30 retrieve retrieve VB 8961 31 31 the the DT 8961 31 32 index index NN 8961 31 33 record record NN 8961 31 34 for for IN 8961 31 35 a a DT 8961 31 36 term term NN 8961 31 37 . . . 8961 32 1 The the DT 8961 32 2 second second JJ 8961 32 3 advantage advantage NN 8961 32 4 offered offer VBN 8961 32 5 by by IN 8961 32 6 the the DT 8961 32 7 binary binary JJ 8961 32 8 vector vector NN 8961 32 9 method method NN 8961 32 10 appears appear VBZ 8961 32 11 in in IN 8961 32 12 the the DT 8961 32 13 search search NN 8961 32 14 process process NN 8961 32 15 . . . 8961 33 1 Most Most JJS 8961 33 2 modern modern JJ 8961 33 3 computers computer NNS 8961 33 4 have have VBP 8961 33 5 a a DT 8961 33 6 built build VBN 8961 33 7 - - HYPH 8961 33 8 in in RP 8961 33 9 capability capability NN 8961 33 10 of of IN 8961 33 11 per- per- NN 8961 33 12 forming form VBG 8961 33 13 Boolean boolean JJ 8961 33 14 logical logical JJ 8961 33 15 manipulations manipulation NNS 8961 33 16 on on IN 8961 33 17 binary binary JJ 8961 33 18 digit digit NN 8961 33 19 vectors vector NNS 8961 33 20 or or CC 8961 33 21 strings string NNS 8961 33 22 . . . 
8961 34 1 Thus thus RB 8961 34 2 , , , 8961 34 3 when when WRB 8961 34 4 Boolean boolean JJ 8961 34 5 operations operation NNS 8961 34 6 are be VBP 8961 34 7 specified specify VBN 8961 34 8 as as IN 8961 34 9 part part NN 8961 34 10 of of IN 8961 34 11 a a DT 8961 34 12 query query NN 8961 34 13 , , , 8961 34 14 the the DT 8961 34 15 imple- imple- NN 8961 34 16 mentation mentation NN 8961 34 17 of of IN 8961 34 18 the the DT 8961 34 19 operations operation NNS 8961 34 20 within within IN 8961 34 21 the the DT 8961 34 22 · · NFP 8961 34 23 computer computer NN 8961 34 24 is be VBZ 8961 34 25 considerably considerably RB 8961 34 26 easier easy JJR 8961 34 27 and and CC 8961 34 28 faster fast RBR 8961 34 29 for for IN 8961 34 30 binary binary JJ 8961 34 31 vectors vector NNS 8961 34 32 than than IN 8961 34 33 for for IN 8961 34 34 the the DT 8961 34 35 standard standard JJ 8961 34 36 form form NN 8961 34 37 of of IN 8961 34 38 inverted invert VBN 8961 34 39 files file NNS 8961 34 40 . . . 8961 35 1 Other other JJ 8961 35 2 investigators investigator NNS 8961 35 3 of of IN 8961 35 4 the the DT 8961 35 5 use use NN 8961 35 6 of of IN 8961 35 7 the the DT 8961 35 8 binary binary JJ 8961 35 9 digit digit NN 8961 35 10 patterns pattern NNS 8961 35 11 or or CC 8961 35 12 vectors vector NNS 8961 35 13 have have VBP 8961 35 14 not not RB 8961 35 15 fully fully RB 8961 35 16 explored explore VBN 8961 35 17 its -PRON- PRP$ 8961 35 18 advantages advantage NNS 8961 35 19 and and CC 8961 35 20 disadvantages disadvantage NNS 8961 35 21 . . . 
8961 36 1 Bloom bloom NN 8961 36 2 suggests suggest VBZ 8961 36 3 , , , 8961 36 4 without without IN 8961 36 5 an an DT 8961 36 6 explanation explanation NN 8961 36 7 or or CC 8961 36 8 evaluation evaluation NN 8961 36 9 , , , 8961 36 10 the the DT 8961 36 11 use use NN 8961 36 12 of of IN 8961 36 13 bit bit NN 8961 36 14 patterns pattern NNS 8961 36 15 as as IN 8961 36 16 the the DT 8961 36 17 storage storage NN 8961 36 18 technique technique NN 8961 36 19 for for IN 8961 36 20 inverted inverted JJ 8961 36 21 files file NNS 8961 36 22 in in IN 8961 36 23 large large JJ 8961 36 24 data data NN 8961 36 25 bases basis NNS 8961 36 26 in in IN 8961 36 27 the the DT 8961 36 28 area area NN 8961 36 29 of of IN 8961 36 30 management management NN 8961 36 31 information information NN 8961 36 32 systems.1 systems.1 CD 8961 36 33 Davis Davis NNP 8961 36 34 and and CC 8961 36 35 Lin Lin NNP 8961 36 36 , , , 8961 36 37 again again RB 8961 36 38 in in IN 8961 36 39 the the DT 8961 36 40 area area NN 8961 36 41 of of IN 8961 36 42 management management NN 8961 36 43 in- in- NNP 8961 36 44 formation formation NN 8961 36 45 systems system NNS 8961 36 46 , , , 8961 36 47 propose propose VB 8961 36 48 bit bit NN 8961 36 49 patterns pattern NNS 8961 36 50 as as IN 8961 36 51 the the DT 8961 36 52 means mean NNS 8961 36 53 of of IN 8961 36 54 locating locate VBG 8961 36 55 pertinent pertinent JJ 8961 36 56 records record NNS 8961 36 57 in in IN 8961 36 58 a a DT 8961 36 59 master master NN 8961 36 60 file file NN 8961 36 61 . . . 8961 37 1 2 2 LS 8961 37 2 They -PRON- PRP 8961 37 3 do do VBP 8961 37 4 not not RB 8961 37 5 compare compare VB 8961 37 6 the the DT 8961 37 7 method method NN 8961 37 8 with with IN 8961 37 9 other other JJ 8961 37 10 pos- pos- JJ 8961 37 11 sible sible JJ 8961 37 12 techniques technique NNS 8961 37 13 . . . 
8961 38 1 Sammon sammon JJ 8961 38 2 discusses discusse NNS 8961 38 3 briefly briefly RB 8961 38 4 the the DT 8961 38 5 use use NN 8961 38 6 of of IN 8961 38 7 binary binary JJ 8961 38 8 vectors vector NNS 8961 38 9 as as IN 8961 38 10 a a DT 8961 38 11 storage storage NN 8961 38 12 technique technique NN 8961 38 13 , , , 8961 38 14 but but CC 8961 38 15 dismisses dismiss VBZ 8961 38 16 it -PRON- PRP 8961 38 17 on on IN 8961 38 18 the the DT 8961 38 19 basis basis NN 8961 38 20 that that IN 8961 38 21 the the DT 8961 38 22 two two CD 8961 38 23 - - HYPH 8961 38 24 valued value VBN 8961 38 25 ap- ap- CD 8961 38 26 proach proach NN 8961 38 27 obviates obviate NNS 8961 38 28 the the DT 8961 38 29 possible possible JJ 8961 38 30 assignment assignment NN 8961 38 31 of of IN 8961 38 32 weights weight NNS 8961 38 33 to to IN 8961 38 34 index index NN 8961 38 35 terms term NNS 8961 38 36 in in IN 8961 38 37 de- de- RB 8961 38 38 scribing scribe VBG 8961 38 39 documents document NNS 8961 38 40 . . . 
8961 39 1 3 3 CD 8961 39 2 Gorokhov Gorokhov NNP 8961 39 3 discusses discuss VBZ 8961 39 4 the the DT 8961 39 5 use use NN 8961 39 6 of of IN 8961 39 7 a a DT 8961 39 8 modified modify VBN 8961 39 9 binary binary NN 8961 39 10 vec- vec- NN 8961 39 11 tor tor NN 8961 39 12 approach approach NN 8961 39 13 in in IN 8961 39 14 a a DT 8961 39 15 document document JJ 8961 39 16 retrieval retrieval NN 8961 39 17 system system NN 8961 39 18 implemented implement VBN 8961 39 19 on on IN 8961 39 20 a a DT 8961 39 21 small small JJ 8961 39 22 Soviet soviet JJ 8961 39 23 computer.4 computer.4 NN 8961 39 24 Faced face VBN 8961 39 25 with with IN 8961 39 26 the the DT 8961 39 27 need need NN 8961 39 28 to to TO 8961 39 29 minimize minimize VB 8961 39 30 storage storage NN 8961 39 31 requirements requirement NNS 8961 39 32 for for IN 8961 39 33 his -PRON- PRP$ 8961 39 34 inverted invert VBN 8961 39 35 file file NN 8961 39 36 , , , 8961 39 37 Gorokhov Gorokhov NNP 8961 39 38 concentrated concentrate VBD 8961 39 39 on on IN 8961 39 40 developing develop VBG 8961 39 41 a a DT 8961 39 42 technique technique NN 8961 39 43 for for IN 8961 39 44 lo- lo- NNP 8961 39 45 cating cating NNP 8961 39 46 and and CC 8961 39 47 removing remove VBG 8961 39 48 strings string NNS 8961 39 49 of of IN 8961 39 50 zeroes zero NNS 8961 39 51 occurring occur VBG 8961 39 52 in in IN 8961 39 53 the the DT 8961 39 54 binary binary JJ 8961 39 55 vectors vector NNS 8961 39 56 used use VBN 8961 39 57 within within IN 8961 39 58 the the DT 8961 39 59 system system NN 8961 39 60 . . . 
8961 40 1 Since since IN 8961 40 2 these these DT 8961 40 3 zeroes zero NNS 8961 40 4 represent represent VBP 8961 40 5 the the DT 8961 40 6 absence absence NN 8961 40 7 of of IN 8961 40 8 information information NN 8961 40 9 they -PRON- PRP 8961 40 10 could could MD 8961 40 11 be be VB 8961 40 12 removed remove VBN 8961 40 13 if if IN 8961 40 14 there there EX 8961 40 15 were be VBD 8961 40 16 a a DT 8961 40 17 way way NN 8961 40 18 to to TO 8961 40 19 indicate indicate VB 8961 40 20 the the DT 8961 40 21 position position NN 8961 40 22 in in IN 8961 40 23 the the DT 8961 40 24 original original JJ 8961 40 25 vector vector NN 8961 40 26 of of IN 8961 40 27 the the DT 8961 40 28 ones one NNS 8961 40 29 that that WDT 8961 40 30 remained remain VBD 8961 40 31 . . . 8961 41 1 He -PRON- PRP 8961 41 2 proposed propose VBD 8961 41 3 the the DT 8961 41 4 removal removal NN 8961 41 5 of of IN 8961 41 6 strings string NNS 8961 41 7 of of IN 8961 41 8 zeroes zero NNS 8961 41 9 and and CC 8961 41 10 the the DT 8961 41 11 inclusion inclusion NN 8961 41 12 of of IN 8961 41 13 numeric numeric JJ 8961 41 14 place place NN 8961 41 15 values value NNS 8961 41 16 with with IN 8961 41 17 the the DT 8961 41 18 re- re- JJ 8961 41 19 maining maining NN 8961 41 20 vector vector NN 8961 41 21 elements element NNS 8961 41 22 . . . 8961 42 1 His -PRON- PRP$ 8961 42 2 result result NN 8961 42 3 is be VBZ 8961 42 4 a a DT 8961 42 5 file file NN 8961 42 6 with with IN 8961 42 7 variable variable JJ 8961 42 8 - - HYPH 8961 42 9 length length NN 8961 42 10 index index NN 8961 42 11 rec- rec- NN 8961 42 12 ords ord NNS 8961 42 13 . . . 
8961 43 1 The the DT 8961 43 2 abandoning abandoning NN 8961 43 3 of of IN 8961 43 4 the the DT 8961 43 5 pure pure JJ 8961 43 6 binary binary JJ 8961 43 7 vector vector NN 8961 43 8 obviates obviate VBZ 8961 43 9 the the DT 8961 43 10 process process NN 8961 43 11 , , , 8961 43 12 and and CC 8961 43 13 Gorokhov Gorokhov NNP 8961 43 14 found find VBD 8961 43 15 it -PRON- PRP 8961 43 16 necessary necessary JJ 8961 43 17 to to TO 8961 43 18 expand expand VB 8961 43 19 the the DT 8961 43 20 vector vector NN 8961 43 21 elements element NNS 8961 43 22 into into IN 8961 43 23 the the DT 8961 43 24 orig- orig- JJ 8961 43 25 inal inal JJ 8961 43 26 vector vector NN 8961 43 27 before before IN 8961 43 28 logical logical JJ 8961 43 29 operations operation NNS 8961 43 30 could could MD 8961 43 31 be be VB 8961 43 32 applied apply VBN 8961 43 33 . . . 8961 44 1 Even even RB 8961 44 2 though though IN 8961 44 3 he -PRON- PRP 8961 44 4 does do VBZ 8961 44 5 not not RB 8961 44 6 state state VB 8961 44 7 so so RB 8961 44 8 explicitly explicitly RB 8961 44 9 , , , 8961 44 10 Gorokhov Gorokhov NNP 8961 44 11 seems seem VBZ 8961 44 12 to to TO 8961 44 13 have have VB 8961 44 14 found find VBN 8961 44 15 his -PRON- PRP$ 8961 44 16 method method NN 8961 44 17 more more RBR 8961 44 18 efficient efficient JJ 8961 44 19 than than IN 8961 44 20 the the DT 8961 44 21 standard standard JJ 8961 44 22 inverted invert VBN 8961 44 23 file file NN 8961 44 24 . . . 8961 45 1 Gorokhov Gorokhov NNP 8961 45 2 ' ' POS 8961 45 3 s s POS 8961 45 4 suggestion suggestion NN 8961 45 5 has have VBZ 8961 45 6 led lead VBN 8961 45 7 to to IN 8961 45 8 the the DT 8961 45 9 development development NN 8961 45 10 of of IN 8961 45 11 an an DT 8961 45 12 algorithm algorithm NN 8961 45 13 for for IN 8961 45 14 the the DT 8961 45 15 compression compression NN 8961 45 16 of of IN 8961 45 17 binary binary JJ 8961 45 18 vectors vector NNS 8961 45 19 . . . 
8961 46 1 Heaps Heaps NNP 8961 46 2 and and CC 8961 46 3 Thiel Thiel NNP 8961 46 4 have have VBP 8961 46 5 also also RB 8961 46 6 discussed discuss VBN 8961 46 7 the the DT 8961 46 8 use use NN 8961 46 9 of of IN 8961 46 10 compressed compress VBN 8961 46 11 binary binary JJ 8961 46 12 vec- vec- NN 8961 46 13 tors tor NNS 8961 46 14 as as IN 8961 46 15 the the DT 8961 46 16 basis basis NN 8961 46 17 of of IN 8961 46 18 an an DT 8961 46 19 inverted invert VBN 8961 46 20 index index NN 8961 46 21 file file NN 8961 46 22 . . . 8961 47 1 5• 5• CD 8961 47 2 6 6 CD 8961 47 3 Aside aside RB 8961 47 4 from from IN 8961 47 5 a a DT 8961 47 6 brief brief JJ 8961 47 7 descrip- descrip- NNS 8961 47 8 tion tion NN 8961 47 9 of of IN 8961 47 10 the the DT 8961 47 11 method method NN 8961 47 12 for for IN 8961 47 13 implementing implement VBG 8961 47 14 the the DT 8961 47 15 concept concept NN 8961 47 16 , , , 8961 47 17 they -PRON- PRP 8961 47 18 offer offer VBP 8961 47 19 no no DT 8961 47 20 compari- compari- JJ 8961 47 21 son son NN 8961 47 22 of of IN 8961 47 23 the the DT 8961 47 24 binary binary JJ 8961 47 25 vector vector NN 8961 47 26 with with IN 8961 47 27 the the DT 8961 47 28 standard standard JJ 8961 47 29 inverted invert VBN 8961 47 30 file file NN 8961 47 31 . . . 
8961 48 1 STORAGE storage NN 8961 48 2 REQUIREMENTS requirement NNS 8961 48 3 An an DT 8961 48 4 immediate immediate JJ 8961 48 5 reaction reaction NN 8961 48 6 to to IN 8961 48 7 the the DT 8961 48 8 concept concept NN 8961 48 9 of of IN 8961 48 10 binary binary JJ 8961 48 11 vectors vector NNS 8961 48 12 is be VBZ 8961 48 13 to to TO 8961 48 14 state state VB 8961 48 15 that that IN 8961 48 16 they -PRON- PRP 8961 48 17 will will MD 8961 48 18 obviously obviously RB 8961 48 19 take take VB 8961 48 20 more more JJR 8961 48 21 storage storage NN 8961 48 22 space space NN 8961 48 23 than than IN 8961 48 24 the the DT 8961 48 25 standard standard JJ 8961 48 26 inverted invert VBN 8961 48 27 file file NN 8961 48 28 . . . 8961 49 1 A a DT 8961 49 2 closer close JJR 8961 49 3 study study NN 8961 49 4 shows show VBZ 8961 49 5 that that IN 8961 49 6 this this DT 8961 49 7 is be VBZ 8961 49 8 not not RB 8961 49 9 always always RB 8961 49 10 the the DT 8961 49 11 case case NN 8961 49 12 . . . 8961 50 1 The the DT 8961 50 2 storage storage NN 8961 50 3 require- require- VBD 8961 50 4 ments ment NNS 8961 50 5 for for IN 8961 50 6 the the DT 8961 50 7 two two CD 8961 50 8 types type NNS 8961 50 9 of of IN 8961 50 10 files file NNS 8961 50 11 may may MD 8961 50 12 be be VB 8961 50 13 calculated calculate VBN 8961 50 14 as as IN 8961 50 15 follows follow VBZ 8961 50 16 : : : 8961 50 17 310 310 CD 8961 50 18 Journal Journal NNP 8961 50 19 of of IN 8961 50 20 Library Library NNP 8961 50 21 Automation Automation NNP 8961 50 22 Vol Vol NNP 8961 50 23 . . . 8961 51 1 7/4 7/4 CD 8961 51 2 December December NNP 8961 51 3 1974 1974 CD 8961 51 4 D·N D·N NNP 8961 51 5 1 1 CD 8961 51 6 . . . 8961 52 1 MBv MBv NNS 8961 52 2 = = SYM 8961 52 3 8 8 CD 8961 52 4 bytes bytes NN 8961 52 5 2 2 CD 8961 52 6 . . . 
8961 53 1 Msr msr JJ 8961 53 2 = = SYM 8961 53 3 D d NN 8961 53 4 · · NFP 8961 53 5 I -PRON- PRP 8961 53 6 · · NFP 8961 53 7 K K NNP 8961 53 8 where where WRB 8961 53 9 : : : 8961 53 10 ( ( -LRB- 8961 53 11 binary binary JJ 8961 53 12 vector vector NN 8961 53 13 file file NN 8961 53 14 ) ) -RRB- 8961 53 15 ( ( -LRB- 8961 53 16 standard standard JJ 8961 53 17 inverted invert VBN 8961 53 18 file file NN 8961 53 19 ) ) -RRB- 8961 53 20 M m NN 8961 53 21 = = SYM 8961 53 22 Storage storage NN 8961 53 23 requirements requirement NNS 8961 53 24 in in IN 8961 53 25 bytes bytes NN 8961 53 26 D d NN 8961 53 27 = = SYM 8961 53 28 Number number NN 8961 53 29 of of IN 8961 53 30 documents document NNS 8961 53 31 in in IN 8961 53 32 the the DT 8961 53 33 system system NN 8961 53 34 N n NN 8961 53 35 = = SYM 8961 53 36 Number number NN 8961 53 37 of of IN 8961 53 38 index index NN 8961 53 39 terms term NNS 8961 53 40 in in IN 8961 53 41 the the DT 8961 53 42 system system NN 8961 53 43 I -PRON- PRP 8961 53 44 = = NFP 8961 53 45 Average average JJ 8961 53 46 depth depth NN 8961 53 47 of of IN 8961 53 48 indexing indexing NN 8961 53 49 in in IN 8961 53 50 the the DT 8961 53 51 system system NN 8961 53 52 K k NN 8961 53 53 = = SYM 8961 53 54 Size size NN 8961 53 55 in in IN 8961 53 56 bytes byte NNS 8961 53 57 of of IN 8961 53 58 a a DT 8961 53 59 document document NN 8961 53 60 number number NN 8961 53 61 stored store VBN 8961 53 62 in in IN 8961 53 63 the the DT 8961 53 64 file file NN 8961 53 65 Using use VBG 8961 53 66 equations equation NNS 8961 53 67 1 1 CD 8961 53 68 and and CC 8961 53 69 2 2 CD 8961 53 70 we -PRON- PRP 8961 53 71 find find VBP 8961 53 72 that that IN 8961 53 73 the the DT 8961 53 74 storage storage NN 8961 53 75 requirements requirement NNS 8961 53 76 for for IN 8961 53 77 the the DT 8961 53 78 binary binary JJ 8961 53 79 vector vector NN 8961 53 80 file file NN 8961 53 81 are be VBP 8961 53 82 , , , 8961 53 83 in in IN 8961 53 84 fact fact NN 8961 53 85 , , 
, 8961 53 86 less less JJR 8961 53 87 than than IN 8961 53 88 the the DT 8961 53 89 requirements requirement NNS 8961 53 90 for for IN 8961 53 91 the the DT 8961 53 92 standard standard JJ 8961 53 93 inverted invert VBN 8961 53 94 file file NN 8961 53 95 if if IN 8961 53 96 N N NNP 8961 53 97 < < XX 8961 53 98 8 8 CD 8961 53 99 · · NFP 8961 53 100 I I NNP 8961 53 101 · · NFP 8961 53 102 K. K. NNP 8961 53 103 It -PRON- PRP 8961 53 104 is be VBZ 8961 53 105 well well RB 8961 53 106 known know VBN 8961 53 107 that that IN 8961 53 108 the the DT 8961 53 109 distribution distribution NN 8961 53 110 of of IN 8961 53 111 the the DT 8961 53 112 use use NN 8961 53 113 of of IN 8961 53 114 index index NN 8961 53 115 terms term NNS 8961 53 116 follows follow VBZ 8961 53 117 a a DT 8961 53 118 logarithmic logarithmic JJ 8961 53 119 curve curve NN 8961 53 120 . . . 8961 54 1 In in IN 8961 54 2 simple simple JJ 8961 54 3 terms term NNS 8961 54 4 , , , 8961 54 5 one one PRP 8961 54 6 might may MD 8961 54 7 say say VB 8961 54 8 that that IN 8961 54 9 a a DT 8961 54 10 few few JJ 8961 54 11 terms term NNS 8961 54 12 are be VBP 8961 54 13 used use VBN 8961 54 14 very very RB 8961 54 15 frequently frequently RB 8961 54 16 and and CC 8961 54 17 many many JJ 8961 54 18 terms term NNS 8961 54 19 are be VBP 8961 54 20 used use VBN 8961 54 21 infrequently infrequently RB 8961 54 22 . . . 
8961 55 1 This this DT 8961 55 2 condi- condi- NN 8961 55 3 tion tion NN 8961 55 4 implies imply VBZ 8961 55 5 that that IN 8961 55 6 in in IN 8961 55 7 a a DT 8961 55 8 binary binary JJ 8961 55 9 vector vector NN 8961 55 10 file file NN 8961 55 11 the the DT 8961 55 12 records record NNS 8961 55 13 for for IN 8961 55 14 many many JJ 8961 55 15 terms term NNS 8961 55 16 will will MD 8961 55 17 contain contain VB 8961 55 18 segments segment NNS 8961 55 19 in in IN 8961 55 20 which which WDT 8961 55 21 there there EX 8961 55 22 are be VBP 8961 55 23 no no DT 8961 55 24 " " `` 8961 55 25 ones one NNS 8961 55 26 " " '' 8961 55 27 in in IN 8961 55 28 any any DT 8961 55 29 byte byte NN 8961 55 30 . . . 8961 56 1 A a DT 8961 56 2 method method NN 8961 56 3 for for IN 8961 56 4 re- re- RB 8961 56 5 moving move VBG 8961 56 6 these these DT 8961 56 7 " " `` 8961 56 8 zero zero CD 8961 56 9 " " '' 8961 56 10 bytes byte NNS 8961 56 11 is be VBZ 8961 56 12 called call VBN 8961 56 13 compression compression NN 8961 56 14 . . . 8961 57 1 COMPRESSION COMPRESSION VBN 8961 57 2 ALGORITHM ALGORITHM NNP 8961 57 3 The the DT 8961 57 4 technique technique NN 8961 57 5 for for IN 8961 57 6 the the DT 8961 57 7 compression compression NN 8961 57 8 of of IN 8961 57 9 binary binary JJ 8961 57 10 vectors vector NNS 8961 57 11 as as IN 8961 57 12 described describe VBN 8961 57 13 here here RB 8961 57 14 is be VBZ 8961 57 15 designed design VBN 8961 57 16 specifically specifically RB 8961 57 17 for for IN 8961 57 18 the the DT 8961 57 19 IBM IBM NNP 8961 57 20 360 360 CD 8961 57 21 family family NN 8961 57 22 of of IN 8961 57 23 computers computer NNS 8961 57 24 and and CC 8961 57 25 similar similar JJ 8961 57 26 machines machine NNS 8961 57 27 . . . 8961 58 1 The the DT 8961 58 2 extension extension NN 8961 58 3 to to IN 8961 58 4 other other JJ 8961 58 5 machines machine NNS 8961 58 6 should should MD 8961 58 7 be be VB 8961 58 8 obvious obvious JJ 8961 58 9 . . . 
8961 59 1 Within within IN 8961 59 2 the the DT 8961 59 3 IBM IBM NNP 8961 59 4 360 360 CD 8961 59 5 the the DT 8961 59 6 byte byte NN 8961 59 7 , , , 8961 59 8 which which WDT 8961 59 9 contains contain VBZ 8961 59 10 eight eight CD 8961 59 11 binary binary NN 8961 59 12 digits digit NNS 8961 59 13 , , , 8961 59 14 is be VBZ 8961 59 15 the the DT 8961 59 16 basic basic JJ 8961 59 17 storage storage NN 8961 59 18 unit unit NN 8961 59 19 , , , 8961 59 20 and and CC 8961 59 21 with with IN 8961 59 22 the the DT 8961 59 23 eight eight CD 8961 59 24 binary binary NN 8961 59 25 digits digit NNS 8961 59 26 it -PRON- PRP 8961 59 27 is be VBZ 8961 59 28 possible possible JJ 8961 59 29 to to TO 8961 59 30 store store VB 8961 59 31 a a DT 8961 59 32 maximum maximum JJ 8961 59 33 integer integer NN 8961 59 34 value value NN 8961 59 35 of of IN 8961 59 36 255 255 CD 8961 59 37 . . . 8961 60 1 For for IN 8961 60 2 the the DT 8961 60 3 purpose purpose NN 8961 60 4 of of IN 8961 60 5 describing describe VBG 8961 60 6 a a DT 8961 60 7 proposed propose VBN 8961 60 8 compression compression NN 8961 60 9 algorithm algorithm NN 8961 60 10 for for IN 8961 60 11 the the DT 8961 60 12 binary binary JJ 8961 60 13 vector vector NN 8961 60 14 in in IN 8961 60 15 the the DT 8961 60 16 IBM IBM NNP 8961 60 17 360 360 CD 8961 60 18 , , , 8961 60 19 the the DT 8961 60 20 term term NN 8961 60 21 sub- sub- JJ 8961 60 22 vector vector NN 8961 60 23 will will MD 8961 60 24 be be VB 8961 60 25 defined define VBN 8961 60 26 as as IN 8961 60 27 a a DT 8961 60 28 string string NN 8961 60 29 of of IN 8961 60 30 contiguous contiguous JJ 8961 60 31 bytes byte NNS 8961 60 32 chosen choose VBN 8961 60 33 from from IN 8961 60 34 within within IN 8961 60 35 the the DT 8961 60 36 binary binary JJ 8961 60 37 vector vector NN 8961 60 38 . . . 
8961 61 1 A a DT 8961 61 2 zero zero CD 8961 61 3 subvector subvector NN 8961 61 4 will will MD 8961 61 5 be be VB 8961 61 6 a a DT 8961 61 7 subvector subvector NN 8961 61 8 each each DT 8961 61 9 of of IN 8961 61 10 whose whose WP$ 8961 61 11 bytes byte NNS 8961 61 12 contains contain VBZ 8961 61 13 eight eight CD 8961 61 14 binary binary JJ 8961 61 15 zeroes zero NNS 8961 61 16 . . . 8961 62 1 A a DT 8961 62 2 nonzero nonzero NN 8961 62 3 subvector subvector NN 8961 62 4 will will MD 8961 62 5 be be VB 8961 62 6 a a DT 8961 62 7 subvector subvector NN 8961 62 8 each each DT 8961 62 9 of of IN 8961 62 10 whose whose WP$ 8961 62 11 bytes byte NNS 8961 62 12 contains contain VBZ 8961 62 13 at at IN 8961 62 14 least least JJS 8961 62 15 one one CD 8961 62 16 binary binary JJ 8961 62 17 one one NN 8961 62 18 . . . 8961 63 1 To to TO 8961 63 2 compress compress VB 8961 63 3 a a DT 8961 63 4 binary binary JJ 8961 63 5 vec- vec- NN 8961 63 6 tor tor NN 8961 63 7 in in IN 8961 63 8 the the DT 8961 63 9 IBM IBM NNP 8961 63 10 360 360 CD 8961 63 11 the the DT 8961 63 12 following follow VBG 8961 63 13 steps step NNS 8961 63 14 may may MD 8961 63 15 be be VB 8961 63 16 taken take VBN 8961 63 17 : : : 8961 63 18 1 1 CD 8961 63 19 . . . 8961 64 1 Divide divide VB 8961 64 2 the the DT 8961 64 3 binary binary JJ 8961 64 4 vector vector NN 8961 64 5 into into IN 8961 64 6 a a DT 8961 64 7 series series NN 8961 64 8 of of IN 8961 64 9 zero zero CD 8961 64 10 subvectors subvector NNS 8961 64 11 and and CC 8961 64 12 nonzero nonzero NNP 8961 64 13 subvectors subvector NNS 8961 64 14 . . . 8961 65 1 Subvectors subvector NNS 8961 65 2 of of IN 8961 65 3 either either DT 8961 65 4 type type NN 8961 65 5 may may MD 8961 65 6 have have VB 8961 65 7 a a DT 8961 65 8 maximum maximum JJ 8961 65 9 length length NN 8961 65 10 of of IN 8961 65 11 255 255 CD 8961 65 12 bytes byte NNS 8961 65 13 . . . 
8961 66 1 For for IN 8961 66 2 zero zero CD 8961 66 3 subvectors subvector NNS 8961 66 4 longer long JJR 8961 66 5 than than IN 8961 66 6 255 255 CD 8961 66 7 bytes byte NNS 8961 66 8 , , , 8961 66 9 the the DT 8961 66 10 256th 256th JJ 8961 66 11 byte byte NN 8961 66 12 is be VBZ 8961 66 13 to to TO 8961 66 14 be be VB 8961 66 15 treated treat VBN 8961 66 16 as as IN 8961 66 17 a a DT 8961 66 18 nonzero nonzero NN 8961 66 19 byte byte NN 8961 66 20 , , , 8961 66 21 thus thus RB 8961 66 22 dividing divide VBG 8961 66 23 the the DT 8961 66 24 long long JJ 8961 66 25 zero zero CD 8961 66 26 subvector subvector NN 8961 66 27 . . . 8961 67 1 2 2 LS 8961 67 2 . . . 8961 68 1 Each each DT 8961 68 2 nonzero nonzero NN 8961 68 3 subvector subvector NN 8961 68 4 is be VBZ 8961 68 5 prefixed prefix VBN 8961 68 6 with with IN 8961 68 7 two two CD 8961 68 8 bytes byte NNS 8961 68 9 . . . 8961 69 1 The the DT 8961 69 2 first first JJ 8961 69 3 of of IN 8961 69 4 the the DT 8961 69 5 prefix prefix JJ 8961 69 6 bytes byte NNS 8961 69 7 contains contain VBZ 8961 69 8 the the DT 8961 69 9 count count NN 8961 69 10 of of IN 8961 69 11 zero zero CD 8961 69 12 bytes byte NNS 8961 69 13 which which WDT 8961 69 14 precede precede VBP 8961 69 15 the the DT 8961 69 16 non- non- NNP 8961 69 17 zero zero CD 8961 69 18 subvector subvector NN 8961 69 19 in in IN 8961 69 20 the the DT 8961 69 21 uncompressed uncompressed JJ 8961 69 22 vector vector NN 8961 69 23 . . . 8961 70 1 The the DT 8961 70 2 second second JJ 8961 70 3 prefix prefix NN 8961 70 4 byte byte NN 8961 70 5 contains contain VBZ 8961 70 6 a a DT 8961 70 7 count count NN 8961 70 8 of of IN 8961 70 9 the the DT 8961 70 10 bytes byte NNS 8961 70 11 in in IN 8961 70 12 the the DT 8961 70 13 nonzero nonzero NN 8961 70 14 subvector subvector NN 8961 70 15 . . . 8961 71 1 3 3 LS 8961 71 2 . . . 
8961 72 1 The the DT 8961 72 2 compressed compressed JJ 8961 72 3 vector vector NN 8961 72 4 then then RB 8961 72 5 consists consist VBZ 8961 72 6 of of IN 8961 72 7 only only RB 8961 72 8 the the DT 8961 72 9 nonzero nonzero NN 8961 72 10 subvectors subvector NNS 8961 72 11 together together RB 8961 72 12 with with IN 8961 72 13 their -PRON- PRP$ 8961 72 14 prefix prefix JJ 8961 72 15 bytes byte NNS 8961 72 16 . . . 8961 73 1 4 4 LS 8961 73 2 . . . 8961 74 1 A a DT 8961 74 2 two two CD 8961 74 3 byte byte NN 8961 74 4 field field NN 8961 74 5 of of IN 8961 74 6 binary binary JJ 8961 74 7 zeroes zero NNS 8961 74 8 will will MD 8961 74 9 end end VB 8961 74 10 the the DT 8961 74 11 compressed compressed JJ 8961 74 12 vector vector NN 8961 74 13 . . . 8961 75 1 Binary Binary NNP 8961 75 2 Vector Vector NNP 8961 75 3 / / SYM 8961 75 4 KING KING NNP 8961 75 5 311 311 CD 8961 75 6 The the DT 8961 75 7 compression compression NN 8961 75 8 of of IN 8961 75 9 the the DT 8961 75 10 vectors vector NNS 8961 75 11 creates create VBZ 8961 75 12 variable variable JJ 8961 75 13 - - HYPH 8961 75 14 length length NN 8961 75 15 records record NNS 8961 75 16 and and CC 8961 75 17 re- re- RB 8961 75 18 moves move VBZ 8961 75 19 the the DT 8961 75 20 advantage advantage NN 8961 75 21 of of IN 8961 75 22 having have VBG 8961 75 23 records record NNS 8961 75 24 which which WDT 8961 75 25 are be VBP 8961 75 26 directly directly RB 8961 75 27 amenable amenable JJ 8961 75 28 to to IN 8961 75 29 Boolean boolean JJ 8961 75 30 manipulation manipulation NN 8961 75 31 . . . 
8961 76 1 The the DT 8961 76 2 effect effect NN 8961 76 3 of of IN 8961 76 4 file file NN 8961 76 5 compression compression NN 8961 76 6 on on IN 8961 76 7 such such JJ 8961 76 8 manipula- manipula- JJ 8961 76 9 tion tion NN 8961 76 10 in in IN 8961 76 11 the the DT 8961 76 12 search search NN 8961 76 13 process process NN 8961 76 14 is be VBZ 8961 76 15 not not RB 8961 76 16 as as RB 8961 76 17 severe severe JJ 8961 76 18 as as IN 8961 76 19 it -PRON- PRP 8961 76 20 might may MD 8961 76 21 appear appear VB 8961 76 22 . . . 8961 77 1 For for IN 8961 77 2 the the DT 8961 77 3 search search NN 8961 77 4 process process NN 8961 77 5 , , , 8961 77 6 the the DT 8961 77 7 compressed compressed JJ 8961 77 8 vector vector NN 8961 77 9 may may MD 8961 77 10 be be VB 8961 77 11 expanded expand VBN 8961 77 12 into into IN 8961 77 13 its -PRON- PRP$ 8961 77 14 original original JJ 8961 77 15 form form NN 8961 77 16 . . . 8961 78 1 The the DT 8961 78 2 process process NN 8961 78 3 of of IN 8961 78 4 expansion expansion NN 8961 78 5 of of IN 8961 78 6 the the DT 8961 78 7 binary binary JJ 8961 78 8 vectors vector NNS 8961 78 9 is be VBZ 8961 78 10 relatively relatively RB 8961 78 11 simple simple JJ 8961 78 12 , , , 8961 78 13 and and CC 8961 78 14 since since IN 8961 78 15 only only RB 8961 78 16 those those DT 8961 78 17 index index NN 8961 78 18 term term NN 8961 78 19 records record NNS 8961 78 20 which which WDT 8961 78 21 are be VBP 8961 78 22 used use VBN 8961 78 23 in in IN 8961 78 24 a a DT 8961 78 25 query query NN 8961 78 26 need need NN 8961 78 27 to to TO 8961 78 28 be be VB 8961 78 29 expanded expand VBN 8961 78 30 at at IN 8961 78 31 the the DT 8961 78 32 search search NN 8961 78 33 time time NN 8961 78 34 , , , 8961 78 35 the the DT 8961 78 36 search search NN 8961 78 37 time time NN 8961 78 38 is be VBZ 8961 78 39 not not RB 8961 78 40 significantly significantly RB 8961 78 41 affected affect VBN 8961 78 42 . . . 
8961 79 1 As as IN 8961 79 2 an an DT 8961 79 3 example example NN 8961 79 4 of of IN 8961 79 5 the the DT 8961 79 6 use use NN 8961 79 7 of of IN 8961 79 8 the the DT 8961 79 9 compression compression NN 8961 79 10 algorithm algorithm NNP 8961 79 11 consider consider VBP 8961 79 12 the the DT 8961 79 13 fol- fol- NN 8961 79 14 lowing low VBG 8961 79 15 binary binary JJ 8961 79 16 vector vector NN 8961 79 17 . . . 8961 80 1 01100000/10000000/ 01100000/10000000/ NFP 8961 80 2 seven seven CD 8961 80 3 zero zero CD 8961 80 4 bytes bytes NN 8961 80 5 /00000001/10000000/ /00000001/10000000/ CD 8961 80 6 ... ... . 8961 81 1 The the DT 8961 81 2 slashes slash NNS 8961 81 3 indicate indicate VBP 8961 81 4 the the DT 8961 81 5 division division NN 8961 81 6 of of IN 8961 81 7 the the DT 8961 81 8 vector vector NN 8961 81 9 into into IN 8961 81 10 bytes byte NNS 8961 81 11 . . . 8961 82 1 The the DT 8961 82 2 vector vector NN 8961 82 3 might may MD 8961 82 4 be be VB 8961 82 5 read read VBN 8961 82 6 as as IN 8961 82 7 indicating indicate VBG 8961 82 8 the the DT 8961 82 9 following follow VBG 8961 82 10 list list NN 8961 82 11 of of IN 8961 82 12 document document NN 8961 82 13 numbers number NNS 8961 82 14 : : : 8961 82 15 2 2 CD 8961 82 16 , , , 8961 82 17 3 3 CD 8961 82 18 , , , 8961 82 19 9 9 CD 8961 82 20 , , , 8961 82 21 80 80 CD 8961 82 22 , , , 8961 82 23 and and CC 8961 82 24 81 81 CD 8961 82 25 . . . 
8961 83 1 In in IN 8961 83 2 a a DT 8961 83 3 standard standard JJ 8961 83 4 inverted invert VBN 8961 83 5 file file NN 8961 83 6 with with IN 8961 83 7 each each DT 8961 83 8 document document NN 8961 83 9 number number NN 8961 83 10 assigned assign VBN 8961 83 11 three three CD 8961 83 12 bytes byte NNS 8961 83 13 of of IN 8961 83 14 storage storage NN 8961 83 15 , , , 8961 83 16 fifteen fifteen CD 8961 83 17 bytes byte NNS 8961 83 18 would would MD 8961 83 19 be be VB 8961 83 20 required require VBN 8961 83 21 to to TO 8961 83 22 store store VB 8961 83 23 these these DT 8961 83 24 numbers number NNS 8961 83 25 . . . 8961 84 1 The the DT 8961 84 2 compressed compressed JJ 8961 84 3 vector vector NN 8961 84 4 which which WDT 8961 84 5 results result VBZ 8961 84 6 from from IN 8961 84 7 the the DT 8961 84 8 application application NN 8961 84 9 of of IN 8961 84 10 the the DT 8961 84 11 algo- algo- JJ 8961 84 12 rithm rithm NN 8961 84 13 is be VBZ 8961 84 14 the the DT 8961 84 15 following follow VBG 8961 84 16 : : : 8961 84 17 00000000/00000010/01100000/10000000/00000111/00000010/ 00000000/00000010/01100000/10000000/00000111/00000010/ CD 8961 84 18 00000001/10000000/ 00000001/10000000/ CD 8961 84 19 ... ... : 8961 84 20 Again again RB 8961 84 21 the the DT 8961 84 22 slashes slash NNS 8961 84 23 separate separate VBP 8961 84 24 the the DT 8961 84 25 vector vector NN 8961 84 26 into into IN 8961 84 27 bytes byte NNS 8961 84 28 . . . 
8961 85 1 For for IN 8961 85 2 the the DT 8961 85 3 purpose purpose NN 8961 85 4 of of IN 8961 85 5 the the DT 8961 85 6 fol- fol- NN 8961 85 7 lowing low VBG 8961 85 8 discussion discussion NN 8961 85 9 consider consider VB 8961 85 10 each each DT 8961 85 11 byte byte NN 8961 85 12 in in IN 8961 85 13 a a DT 8961 85 14 vector vector NN 8961 85 15 to to TO 8961 85 16 be be VB 8961 85 17 numbered number VBN 8961 85 18 sequential- sequential- NN 8961 85 19 ly ly XX 8961 85 20 beginning begin VBG 8961 85 21 with with IN 8961 85 22 byte byte NN 8961 85 23 one one CD 8961 85 24 at at IN 8961 85 25 the the DT 8961 85 26 left left NN 8961 85 27 . . . 8961 86 1 In in IN 8961 86 2 the the DT 8961 86 3 uncompressed uncompressed JJ 8961 86 4 vector vector NN 8961 86 5 bytes byte VBZ 8961 86 6 one one CD 8961 86 7 and and CC 8961 86 8 two two CD 8961 86 9 form form NN 8961 86 10 a a DT 8961 86 11 nonzero nonzero NN 8961 86 12 subvector subvector NN 8961 86 13 . . . 8961 87 1 Consequently consequently RB 8961 87 2 , , , 8961 87 3 the the DT 8961 87 4 first first JJ 8961 87 5 four four CD 8961 87 6 bytes byte NNS 8961 87 7 in in IN 8961 87 8 the the DT 8961 87 9 compressed compress VBN 8961 87 10 vector vector NN 8961 87 11 can can MD 8961 87 12 be be VB 8961 87 13 inter- inter- NN 8961 87 14 preted prete VBN 8961 87 15 as as IN 8961 87 16 follows follow VBZ 8961 87 17 : : : 8961 87 18 Byte byte VB 8961 87 19 one one CD 8961 87 20 . . . 8961 88 1 Binary binary JJ 8961 88 2 zero zero CD 8961 88 3 indicating indicate VBG 8961 88 4 that that IN 8961 88 5 no no DT 8961 88 6 zero zero CD 8961 88 7 bytes byte NNS 8961 88 8 were be VBD 8961 88 9 re- re- RB 8961 88 10 moved move VBN 8961 88 11 preceding precede VBG 8961 88 12 this this DT 8961 88 13 subvector subvector NN 8961 88 14 . . . 8961 89 1 Byte byte NN 8961 89 2 two two CD 8961 89 3 . . . 
8961 90 1 Binary binary JJ 8961 90 2 two two CD 8961 90 3 indicating indicate VBG 8961 90 4 that that IN 8961 90 5 the the DT 8961 90 6 following follow VBG 8961 90 7 nonzero nonzero CD 8961 90 8 sub- sub- JJ 8961 90 9 vector vector NN 8961 90 10 is be VBZ 8961 90 11 two two CD 8961 90 12 bytes byte NNS 8961 90 13 long long JJ 8961 90 14 . . . 8961 91 1 Bytes byte NNS 8961 91 2 three three CD 8961 91 3 , , , 8961 91 4 four four CD 8961 91 5 . . . 8961 92 1 Bytes byte NNS 8961 92 2 one one CD 8961 92 3 and and CC 8961 92 4 two two CD 8961 92 5 of of IN 8961 92 6 the the DT 8961 92 7 original original JJ 8961 92 8 vector vector NN 8961 92 9 . . . 8961 93 1 Bytes byte NNS 8961 93 2 three three CD 8961 93 3 through through IN 8961 93 4 nine nine CD 8961 93 5 of of IN 8961 93 6 the the DT 8961 93 7 original original JJ 8961 93 8 vector vector NN 8961 93 9 are be VBP 8961 93 10 a a DT 8961 93 11 zero zero CD 8961 93 12 subvector subvector NN 8961 93 13 , , , 8961 93 14 and and CC 8961 93 15 bytes byte VBZ 8961 93 16 ten ten CD 8961 93 17 and and CC 8961 93 18 eleven eleven CD 8961 93 19 form form NN 8961 93 20 a a DT 8961 93 21 second second JJ 8961 93 22 nonzero nonzero NN 8961 93 23 subvector subvector NN 8961 93 24 . . . 8961 94 1 Consequently consequently RB 8961 94 2 , , , 8961 94 3 the the DT 8961 94 4 second second JJ 8961 94 5 four four CD 8961 94 6 bytes byte NNS 8961 94 7 of of IN 8961 94 8 the the DT 8961 94 9 compressed compressed JJ 8961 94 10 vector vector NN 8961 94 11 are be VBP 8961 94 12 interpreted interpret VBN 8961 94 13 as as IN 8961 94 14 follows follow VBZ 8961 94 15 : : : 8961 94 16 Byte byte NN 8961 94 17 five five CD 8961 94 18 . . . 
8961 95 1 Binary binary JJ 8961 95 2 seven seven CD 8961 95 3 indicating indicate VBG 8961 95 4 that that IN 8961 95 5 a a DT 8961 95 6 zero zero CD 8961 95 7 subvector subvector NN 8961 95 8 of of IN 8961 95 9 seven seven CD 8961 95 10 bytes byte NNS 8961 95 11 has have VBZ 8961 95 12 been be VBN 8961 95 13 removed remove VBN 8961 95 14 . . . 8961 96 1 Byte byte NN 8961 96 2 six six CD 8961 96 3 . . . 8961 97 1 Binary binary JJ 8961 97 2 two two CD 8961 97 3 indicating indicate VBG 8961 97 4 that that IN 8961 97 5 the the DT 8961 97 6 following follow VBG 8961 97 7 two two CD 8961 97 8 bytes byte NNS 8961 97 9 are be VBP 8961 97 10 a a DT 8961 97 11 nonzero nonzero NN 8961 97 12 subvector subvector NN 8961 97 13 . . . 8961 98 1 Bytes byte NNS 8961 98 2 seven seven CD 8961 98 3 , , , 8961 98 4 eight eight CD 8961 98 5 . . . 8961 99 1 Bytes byte NNS 8961 99 2 ten ten CD 8961 99 3 and and CC 8961 99 4 eleven eleven CD 8961 99 5 of of IN 8961 99 6 the the DT 8961 99 7 original original JJ 8961 99 8 vector vector NN 8961 99 9 . . . 8961 100 1 Thus thus RB 8961 100 2 the the DT 8961 100 3 binary binary JJ 8961 100 4 vector vector NN 8961 100 5 has have VBZ 8961 100 6 been be VBN 8961 100 7 reduced reduce VBN 8961 100 8 from from IN 8961 100 9 eleven eleven CD 8961 100 10 bytes byte NNS 8961 100 11 to to IN 8961 100 12 eight eight CD 8961 100 13 312 312 CD 8961 100 14 Journal Journal NNP 8961 100 15 of of IN 8961 100 16 Library Library NNP 8961 100 17 Automation Automation NNP 8961 100 18 Vol Vol NNP 8961 100 19 . . . 
8961 101 1 7/4 7/4 CD 8961 101 2 December December NNP 8961 101 3 1974 1974 CD 8961 101 4 bytes byte NNS 8961 101 5 while while IN 8961 101 6 the the DT 8961 101 7 space space NN 8961 101 8 required require VBN 8961 101 9 to to TO 8961 101 10 record record VB 8961 101 11 the the DT 8961 101 12 document document NN 8961 101 13 numbers number NNS 8961 101 14 in in IN 8961 101 15 the the DT 8961 101 16 stan- stan- NNP 8961 101 17 dard dard NNP 8961 101 18 inverted invert VBN 8961 101 19 file file NN 8961 101 20 remains remain VBZ 8961 101 21 fifteen fifteen CD 8961 101 22 bytes byte NNS 8961 101 23 . . . 8961 102 1 MEMORY memory NN 8961 102 2 REQUIREMENTS requirement NNS 8961 102 3 FOR for IN 8961 102 4 THE the DT 8961 102 5 STANDARD STANDARD NNP 8961 102 6 INVERTED INVERTED NNP 8961 102 7 FILE FILE NNP 8961 102 8 AND and CC 8961 102 9 THE the DT 8961 102 10 BINARY BINARY NNP 8961 102 11 VECTOR vector NN 8961 102 12 FILE FILE VBD 8961 102 13 To to TO 8961 102 14 compare compare VB 8961 102 15 memory memory NN 8961 102 16 requirements requirement NNS 8961 102 17 for for IN 8961 102 18 the the DT 8961 102 19 standard standard JJ 8961 102 20 inverted invert VBN 8961 102 21 file file NN 8961 102 22 and and CC 8961 102 23 the the DT 8961 102 24 compressed compressed JJ 8961 102 25 binary binary JJ 8961 102 26 vector vector NN 8961 102 27 file file NN 8961 102 28 , , , 8961 102 29 we -PRON- PRP 8961 102 30 base base VBP 8961 102 31 our -PRON- PRP$ 8961 102 32 comparison comparison NN 8961 102 33 on on IN 8961 102 34 the the DT 8961 102 35 total total JJ 8961 102 36 number number NN 8961 102 37 of of IN 8961 102 38 postings posting NNS 8961 102 39 in in IN 8961 102 40 the the DT 8961 102 41 file file NN 8961 102 42 . . . 
8961 103 1 In in IN 8961 103 2 the the DT 8961 103 3 standard standard JJ 8961 103 4 inverted invert VBN 8961 103 5 file file NN 8961 103 6 the the DT 8961 103 7 storage storage NN 8961 103 8 space space NN 8961 103 9 for for IN 8961 103 10 the the DT 8961 103 11 postings posting NNS 8961 103 12 is be VBZ 8961 103 13 equal equal JJ 8961 103 14 to to IN 8961 103 15 the the DT 8961 103 16 number number NN 8961 103 17 of of IN 8961 103 18 postings posting NNS 8961 103 19 times time NNS 8961 103 20 the the DT 8961 103 21 length length NN 8961 103 22 of of IN 8961 103 23 a a DT 8961 103 24 sin- sin- RB 8961 103 25 gle gle JJ 8961 103 26 posting posting NN 8961 103 27 , , , 8961 103 28 which which WDT 8961 103 29 is be VBZ 8961 103 30 usually usually RB 8961 103 31 two two CD 8961 103 32 , , , 8961 103 33 three three CD 8961 103 34 , , , 8961 103 35 or or CC 8961 103 36 five five CD 8961 103 37 bytes byte NNS 8961 103 38 . . . 8961 104 1 Memory memory NN 8961 104 2 requirements requirement NNS 8961 104 3 for for IN 8961 104 4 the the DT 8961 104 5 compressed compressed JJ 8961 104 6 binary binary JJ 8961 104 7 vector vector NN 8961 104 8 file file NN 8961 104 9 are be VBP 8961 104 10 more more RBR 8961 104 11 difficult difficult JJ 8961 104 12 to to TO 8961 104 13 estimate estimate VB 8961 104 14 because because IN 8961 104 15 the the DT 8961 104 16 distribution distribution NN 8961 104 17 of of IN 8961 104 18 document document NN 8961 104 19 numbers number NNS 8961 104 20 within within IN 8961 104 21 the the DT 8961 104 22 record record NN 8961 104 23 for for IN 8961 104 24 each each DT 8961 104 25 index index NN 8961 104 26 term term NN 8961 104 27 is be VBZ 8961 104 28 not not RB 8961 104 29 known know VBN 8961 104 30 . . . 
8961 105 1 The the DT 8961 105 2 fact fact NN 8961 105 3 that that IN 8961 105 4 a a DT 8961 105 5 single single JJ 8961 105 6 byte byte NN 8961 105 7 in in IN 8961 105 8 the the DT 8961 105 9 binary binary JJ 8961 105 10 vector vector NN 8961 105 11 file file NN 8961 105 12 may may MD 8961 105 13 contain contain VB 8961 105 14 between between IN 8961 105 15 zero zero CD 8961 105 16 and and CC 8961 105 17 eight eight CD 8961 105 18 postings posting NNS 8961 105 19 is be VBZ 8961 105 20 extremely extremely RB 8961 105 21 important important JJ 8961 105 22 . . . 8961 106 1 The the DT 8961 106 2 worst bad JJS 8961 106 3 possible possible JJ 8961 106 4 case case NN 8961 106 5 occurs occur VBZ 8961 106 6 if if IN 8961 106 7 the the DT 8961 106 8 postings posting NNS 8961 106 9 in in IN 8961 106 10 the the DT 8961 106 11 binary binary JJ 8961 106 12 vector vector NN 8961 106 13 are be VBP 8961 106 14 spaced space VBN 8961 106 15 in in IN 8961 106 16 such such PDT 8961 106 17 a a DT 8961 106 18 way way NN 8961 106 19 that that IN 8961 106 20 each each DT 8961 106 21 nonzero nonzero NN 8961 106 22 byte byte NN 8961 106 23 contains contain VBZ 8961 106 24 only only RB 8961 106 25 one one CD 8961 106 26 posting posting NN 8961 106 27 , , , 8961 106 28 and and CC 8961 106 29 these these DT 8961 106 30 bytes byte NNS 8961 106 31 are be VBP 8961 106 32 separated separate VBN 8961 106 33 by by IN 8961 106 34 zero zero CD 8961 106 35 bytes byte NNS 8961 106 36 . . . 8961 107 1 Consider consider VB 8961 107 2 the the DT 8961 107 3 following follow VBG 8961 107 4 example example NN 8961 107 5 : : : 8961 107 6 ... ... NFP 8961 107 7 /00000000/00010000/00000000/00000100/ /00000000/00010000/00000000/00000100/ NFP 8961 107 8 ... ... . 
8961 108 1 In in IN 8961 108 2 this this DT 8961 108 3 case case NN 8961 108 4 the the DT 8961 108 5 compression compression NN 8961 108 6 algorithm algorithm NNP 8961 108 7 will will MD 8961 108 8 remove remove VB 8961 108 9 the the DT 8961 108 10 zero zero CD 8961 108 11 bytes byte NNS 8961 108 12 , , , 8961 108 13 but but CC 8961 108 14 will will MD 8961 108 15 add add VB 8961 108 16 two two CD 8961 108 17 bytes byte NNS 8961 108 18 ( ( -LRB- 8961 108 19 the the DT 8961 108 20 prefix prefix JJ 8961 108 21 bytes byte VBZ 8961 108 22 ) ) -RRB- 8961 108 23 for for IN 8961 108 24 each each DT 8961 108 25 nonzero nonzero NN 8961 108 26 byte byte NN 8961 108 27 . . . 8961 109 1 The the DT 8961 109 2 resulting result VBG 8961 109 3 com- com- NN 8961 109 4 pressed press VBN 8961 109 5 vector vector NN 8961 109 6 will will MD 8961 109 7 be be VB 8961 109 8 essentially essentially RB 8961 109 9 the the DT 8961 109 10 same same JJ 8961 109 11 length length NN 8961 109 12 as as IN 8961 109 13 the the DT 8961 109 14 standard standard JJ 8961 109 15 inverted invert VBN 8961 109 16 file file NN 8961 109 17 record record NN 8961 109 18 if if IN 8961 109 19 each each DT 8961 109 20 posting post VBG 8961 109 21 is be VBZ 8961 109 22 three three CD 8961 109 23 bytes byte NNS 8961 109 24 long long JJ 8961 109 25 in in IN 8961 109 26 the the DT 8961 109 27 standard standard JJ 8961 109 28 inverted invert VBN 8961 109 29 file file NN 8961 109 30 . . . 8961 110 1 It -PRON- PRP 8961 110 2 might may MD 8961 110 3 seem seem VB 8961 110 4 that that IN 8961 110 5 the the DT 8961 110 6 distribution distribution NN 8961 110 7 of of IN 8961 110 8 one one CD 8961 110 9 posting post VBG 8961 110 10 per per IN 8961 110 11 byte byte NN 8961 110 12 for for IN 8961 110 13 the the DT 8961 110 14 entire entire JJ 8961 110 15 vector vector NN 8961 110 16 represents represent VBZ 8961 110 17 an an DT 8961 110 18 even even RB 8961 110 19 worse bad JJR 8961 110 20 situation situation NN 8961 110 21 . . . 
8961 111 1 It -PRON- PRP 8961 111 2 is be VBZ 8961 111 3 clear clear JJ 8961 111 4 that that IN 8961 111 5 the the DT 8961 111 6 compression compression NN 8961 111 7 algorithm algorithm NNP 8961 111 8 will will MD 8961 111 9 , , , 8961 111 10 in in IN 8961 111 11 this this DT 8961 111 12 case case NN 8961 111 13 , , , 8961 111 14 not not RB 8961 111 15 reduce reduce VB 8961 111 16 the the DT 8961 111 17 size size NN 8961 111 18 of of IN 8961 111 19 the the DT 8961 111 20 vector vector NN 8961 111 21 . . . 8961 112 1 However however RB 8961 112 2 , , , 8961 112 3 it -PRON- PRP 8961 112 4 must must MD 8961 112 5 be be VB 8961 112 6 remembered remember VBN 8961 112 7 that that IN 8961 112 8 in in IN 8961 112 9 the the DT 8961 112 10 standard standard JJ 8961 112 11 inverted invert VBN 8961 112 12 file file NN 8961 112 13 each each DT 8961 112 14 posting posting NN 8961 112 15 will will MD 8961 112 16 re- re- RB 8961 112 17 quire quire VB 8961 112 18 at at RB 8961 112 19 least least RBS 8961 112 20 two two CD 8961 112 21 bytes byte NNS 8961 112 22 and and CC 8961 112 23 perhaps perhaps RB 8961 112 24 three three CD 8961 112 25 bytes byte NNS 8961 112 26 . . . 8961 113 1 Thus thus RB 8961 113 2 , , , 8961 113 3 the the DT 8961 113 4 length length NN 8961 113 5 of of IN 8961 113 6 the the DT 8961 113 7 record record NN 8961 113 8 in in IN 8961 113 9 the the DT 8961 113 10 standard standard JJ 8961 113 11 inverted invert VBN 8961 113 12 file file NN 8961 113 13 is be VBZ 8961 113 14 two two CD 8961 113 15 or or CC 8961 113 16 three three CD 8961 113 17 times time NNS 8961 113 18 longer long JJR 8961 113 19 than than IN 8961 113 20 the the DT 8961 113 21 corresponding correspond VBG 8961 113 22 binary binary JJ 8961 113 23 vector vector NN 8961 113 24 regardless regardless RB 8961 113 25 of of IN 8961 113 26 compression compression NN 8961 113 27 . . . 
8961 114 1 In in IN 8961 114 2 data datum NNS 8961 114 3 used use VBN 8961 114 4 in in IN 8961 114 5 two two CD 8961 114 6 model model NN 8961 114 7 retrieval retrieval NN 8961 114 8 systems system NNS 8961 114 9 prepared prepare VBD 8961 114 10 to to TO 8961 114 11 compare compare VB 8961 114 12 the the DT 8961 114 13 standard standard JJ 8961 114 14 inverted invert VBN 8961 114 15 file file NN 8961 114 16 and and CC 8961 114 17 the the DT 8961 114 18 binary binary JJ 8961 114 19 vector vector NN 8961 114 20 file file NN 8961 114 21 there there EX 8961 114 22 are be VBP 8961 114 23 6,121 6,121 CD 8961 114 24 documents document NNS 8961 114 25 with with IN 8961 114 26 a a DT 8961 114 27 total total NN 8961 114 28 of of IN 8961 114 29 94,542 94,542 CD 8961 114 30 postings posting NNS 8961 114 31 . . . 8961 115 1 An an DT 8961 115 2 examination examination NN 8961 115 3 of of IN 8961 115 4 the the DT 8961 115 5 binary binary JJ 8961 115 6 inverted invert VBN 8961 115 7 file file NN 8961 115 8 for for IN 8961 115 9 the the DT 8961 115 10 model model NN 8961 115 11 systems system NNS 8961 115 12 discloses disclose VBZ 8961 115 13 that that IN 8961 115 14 there there EX 8961 115 15 are be VBP 8961 115 16 only only RB 8961 115 17 55,311 55,311 CD 8961 115 18 nonzero nonzero NN 8961 115 19 bytes byte NNS 8961 115 20 in in IN 8961 115 21 the the DT 8961 115 22 binary binary JJ 8961 115 23 vector vector NN 8961 115 24 file file NN 8961 115 25 . . . 8961 116 1 Thus thus RB 8961 116 2 there there EX 8961 116 3 seems seem VBZ 8961 116 4 to to TO 8961 116 5 be be VB 8961 116 6 some some DT 8961 116 7 form form NN 8961 116 8 of of IN 8961 116 9 clustering clustering NN 8961 116 10 of of IN 8961 116 11 the the DT 8961 116 12 document document NN 8961 116 13 numbers number NNS 8961 116 14 in in IN 8961 116 15 each each DT 8961 116 16 index index NN 8961 116 17 term term NN 8961 116 18 record record NN 8961 116 19 . . . 
8961 117 1 If if IN 8961 117 2 each each DT 8961 117 3 nonzero nonzero NN 8961 117 4 byte byte NN 8961 117 5 in in IN 8961 117 6 this this DT 8961 117 7 binary binary JJ 8961 117 8 vector vector NN 8961 117 9 is be VBZ 8961 117 10 isolated isolate VBN 8961 117 11 by by IN 8961 117 12 zero zero CD 8961 117 13 bytes byte NNS 8961 117 14 , , , 8961 117 15 two two CD 8961 117 16 prefix prefix JJ 8961 117 17 bytes byte NNS 8961 117 18 would would MD 8961 117 19 be be VB 8961 117 20 added add VBN 8961 117 21 for for IN 8961 117 22 each each DT 8961 117 23 byte byte NN 8961 117 24 . . . 8961 118 1 Thus thus RB 8961 118 2 the the DT 8961 118 3 total total JJ 8961 118 4 memory memory NN 8961 118 5 requirements requirement NNS 8961 118 6 for for IN 8961 118 7 the the DT 8961 118 8 postings posting NNS 8961 118 9 in in IN 8961 118 10 the the DT 8961 118 11 compressed compress VBN 8961 118 12 file file NN 8961 118 13 would would MD 8961 118 14 be be VB 8961 118 15 165,933 165,933 CD 8961 118 16 bytes byte NNS 8961 118 17 . . . 8961 119 1 Less Less JJR 8961 119 2 storage storage NN 8961 119 3 space space NN 8961 119 4 is be VBZ 8961 119 5 required require VBN 8961 119 6 if if IN 8961 119 7 some some DT 8961 119 8 nonzero nonzero NN 8961 119 9 bytes byte NNS 8961 119 10 are be VBP 8961 119 11 contiguous contiguous JJ 8961 119 12 . . . 
8961 120 1 On on IN 8961 120 2 the the DT 8961 120 3 other other JJ 8961 120 4 hand hand NN 8961 120 5 , , , 8961 120 6 the the DT 8961 120 7 standard standard NN 8961 120 8 in- in- NNP 8961 120 9 verted verte VBN 8961 120 10 file file NN 8961 120 11 will will MD 8961 120 12 require require VB 8961 120 13 189,084 189,084 CD 8961 120 14 bytes byte NNS 8961 120 15 if if IN 8961 120 16 a a DT 8961 120 17 two two CD 8961 120 18 - - HYPH 8961 120 19 byte byte NN 8961 120 20 posting posting NN 8961 120 21 is be VBZ 8961 120 22 used use VBN 8961 120 23 , , , 8961 120 24 or or CC 8961 120 25 283,626 283,626 CD 8961 120 26 bytes byte NNS 8961 120 27 if if IN 8961 120 28 a a DT 8961 120 29 three three CD 8961 120 30 - - HYPH 8961 120 31 byte byte NN 8961 120 32 posting posting NN 8961 120 33 is be VBZ 8961 120 34 used use VBN 8961 120 35 . . . 8961 121 1 Further further JJ 8961 121 2 study study NN 8961 121 3 of of IN 8961 121 4 the the DT 8961 121 5 cluster- cluster- XX 8961 121 6 ing ing NNP 8961 121 7 phenomenon phenomenon NN 8961 121 8 is be VBZ 8961 121 9 needed need VBN 8961 121 10 . . . 8961 122 1 Binary Binary NNP 8961 122 2 Vector Vector NNP 8961 122 3 /KING /KING . 
8961 122 4 313 313 CD 8961 122 5 MODEL MODEL NNP 8961 122 6 RETRIEVAL RETRIEVAL NNP 8961 122 7 SYSTEMS systems NN 8961 122 8 To to TO 8961 122 9 test test VB 8961 122 10 some some DT 8961 122 11 of of IN 8961 122 12 the the DT 8961 122 13 conjectures conjecture NNS 8961 122 14 about about IN 8961 122 15 the the DT 8961 122 16 differences difference NNS 8961 122 17 between between IN 8961 122 18 the the DT 8961 122 19 stan- stan- NNP 8961 122 20 dard dard NNP 8961 122 21 inverted invert VBN 8961 122 22 file file NN 8961 122 23 and and CC 8961 122 24 the the DT 8961 122 25 binary binary JJ 8961 122 26 vector vector NN 8961 122 27 file file NN 8961 122 28 , , , 8961 122 29 two two CD 8961 122 30 model model NN 8961 122 31 systems system NNS 8961 122 32 were be VBD 8961 122 33 pre- pre- RB 8961 122 34 pared pare VBN 8961 122 35 for for IN 8961 122 36 operation operation NN 8961 122 37 on on IN 8961 122 38 an an DT 8961 122 39 IBM IBM NNP 8961 122 40 360/67 360/67 CD 8961 122 41 . . . 8961 123 1 Details detail NNS 8961 123 2 of of IN 8961 123 3 the the DT 8961 123 4 systems system NNS 8961 123 5 and and CC 8961 123 6 PL/1 PL/1 NNP 8961 123 7 program program NN 8961 123 8 listings listing NNS 8961 123 9 are be VBP 8961 123 10 available available JJ 8961 123 11 elsewhere elsewhere RB 8961 123 12 . . . 8961 124 1 7 7 CD 8961 124 2 The the DT 8961 124 3 data datum NNS 8961 124 4 base base NN 8961 124 5 used use VBN 8961 124 6 was be VBD 8961 124 7 obtained obtain VBN 8961 124 8 from from IN 8961 124 9 the the DT 8961 124 10 Institute Institute NNP 8961 124 11 of of IN 8961 124 12 Animal Animal NNP 8961 124 13 Behavior Behavior NNP 8961 124 14 at at IN 8961 124 15 Rutgers Rutgers NNP 8961 124 16 University University NNP 8961 124 17 . . . 
8961 125 1 In in IN 8961 125 2 the the DT 8961 125 3 data datum NNS 8961 125 4 base base NN 8961 125 5 6,121 6,121 CD 8961 125 6 documents document NNS 8961 125 7 were be VBD 8961 125 8 indexed index VBN 8961 125 9 by by IN 8961 125 10 1,484 1,484 CD 8961 125 11 index index NN 8961 125 12 terms term NNS 8961 125 13 . . . 8961 126 1 A a DT 8961 126 2 total total NN 8961 126 3 of of IN 8961 126 4 94,542 94,542 CD 8961 126 5 postings posting NNS 8961 126 6 in in IN 8961 126 7 the the DT 8961 126 8 system system NN 8961 126 9 gives give VBZ 8961 126 10 an an DT 8961 126 11 average average JJ 8961 126 12 depth depth NN 8961 126 13 of of IN 8961 126 14 indexing indexing NN 8961 126 15 of of IN 8961 126 16 15.4 15.4 CD 8961 126 17 terms term NNS 8961 126 18 per per IN 8961 126 19 document document NN 8961 126 20 . . . 8961 127 1 Both both DT 8961 127 2 inverted invert VBN 8961 127 3 files file NNS 8961 127 4 were be VBD 8961 127 5 stored store VBN 8961 127 6 on on IN 8961 127 7 IBM IBM NNP 8961 127 8 2314 2314 CD 8961 127 9 disc disc NN 8961 127 10 storage storage NN 8961 127 11 devices device NNS 8961 127 12 . . . 8961 128 1 To to TO 8961 128 2 ease ease VB 8961 128 3 the the DT 8961 128 4 problem problem NN 8961 128 5 of of IN 8961 128 6 handling handle VBG 8961 128 7 variable variable JJ 8961 128 8 - - HYPH 8961 128 9 length length NN 8961 128 10 records record NNS 8961 128 11 in in IN 8961 128 12 both both DT 8961 128 13 files file NNS 8961 128 14 the the DT 8961 128 15 logical logical JJ 8961 128 16 records record NNS 8961 128 17 for for IN 8961 128 18 each each DT 8961 128 19 index index NN 8961 128 20 term term NN 8961 128 21 were be VBD 8961 128 22 divided divide VBN 8961 128 23 into into IN 8961 128 24 chains chain NNS 8961 128 25 of of IN 8961 128 26 fixed- fixed- JJ 8961 128 27 length length NN 8961 128 28 physical physical JJ 8961 128 29 records record NNS 8961 128 30 . . .
8961 129 1 For for IN 8961 129 2 the the DT 8961 129 3 standard standard JJ 8961 129 4 inverted invert VBN 8961 129 5 file file NN 8961 129 6 a a DT 8961 129 7 physical physical JJ 8961 129 8 record record NN 8961 129 9 size size NN 8961 129 10 of of IN 8961 129 11 331 331 CD 8961 129 12 bytes byte NNS 8961 129 13 was be VBD 8961 129 14 chosen choose VBN 8961 129 15 . . . 8961 130 1 The the DT 8961 130 2 entire entire JJ 8961 130 3 file file NN 8961 130 4 required require VBD 8961 130 5 702,713 702,713 CD 8961 130 6 bytes byte NNS 8961 130 7 including include VBG 8961 130 8 record record NN 8961 130 9 overhead overhead RB 8961 130 10 . . . 8961 131 1 For for IN 8961 131 2 the the DT 8961 131 3 uncompressed uncompressed JJ 8961 131 4 binary binary JJ 8961 131 5 vector vector NN 8961 131 6 file file NN 8961 131 7 a a DT 8961 131 8 physical physical JJ 8961 131 9 record record NN 8961 131 10 size size NN 8961 131 11 of of IN 8961 131 12 1,286 1,286 CD 8961 131 13 bytes byte NNS 8961 131 14 was be VBD 8961 131 15 chosen choose VBN 8961 131 16 to to TO 8961 131 17 include include VB 8961 131 18 overhead overhead NN 8961 131 19 and and CC 8961 131 20 space space VB 8961 131 21 for for IN 8961 131 22 up up IN 8961 131 23 to to TO 8961 131 24 10,216 10,216 CD 8961 131 25 document document NN 8961 131 26 numbers number NNS 8961 131 27 . . . 
8961 132 1 When when WRB 8961 132 2 the the DT 8961 132 3 compression compression NN 8961 132 4 algorithm algorithm NNP 8961 132 5 was be VBD 8961 132 6 applied apply VBN 8961 132 7 , , , 8961 132 8 with with IN 8961 132 9 a a DT 8961 132 10 physical physical JJ 8961 132 11 record record NN 8961 132 12 length length NN 8961 132 13 of of IN 8961 132 14 130 130 CD 8961 132 15 bytes byte NNS 8961 132 16 , , , 8961 132 17 the the DT 8961 132 18 memory memory NN 8961 132 19 requirements requirement NNS 8961 132 20 for for IN 8961 132 21 the the DT 8961 132 22 binary binary JJ 8961 132 23 vector vector NN 8961 132 24 file file NN 8961 132 25 were be VBD 8961 132 26 reduced reduce VBN 8961 132 27 to to IN 8961 132 28 281,450 281,450 CD 8961 132 29 bytes byte NNS 8961 132 30 , , , 8961 132 31 or or CC 8961 132 32 41 41 CD 8961 132 33 percent percent NN 8961 132 34 of of IN 8961 132 35 the the DT 8961 132 36 space space NN 8961 132 37 required require VBN 8961 132 38 to to TO 8961 132 39 store store VB 8961 132 40 the the DT 8961 132 41 standard standard JJ 8961 132 42 inverted invert VBN 8961 132 43 file file NN 8961 132 44 . . . 8961 133 1 A a DT 8961 133 2 series series NN 8961 133 3 of of IN 8961 133 4 forty forty CD 8961 133 5 searches search NNS 8961 133 6 of of IN 8961 133 7 varying varying NN 8961 133 8 complexities complexity NNS 8961 133 9 were be VBD 8961 133 10 run run VBN 8961 133 11 against against IN 8961 133 12 both both DT 8961 133 13 files file NNS 8961 133 14 . . . 
8961 134 1 The the DT 8961 134 2 " " `` 8961 134 3 TIME time NN 8961 134 4 " " '' 8961 134 5 function function NN 8961 134 6 of of IN 8961 134 7 PL/1 PL/1 NNP 8961 134 8 made make VBD 8961 134 9 it -PRON- PRP 8961 134 10 possible possible JJ 8961 134 11 to to TO 8961 134 12 accumulate accumulate VB 8961 134 13 tim- tim- NNS 8961 134 14 ing ing NNP 8961 134 15 statistics statistic NNS 8961 134 16 which which WDT 8961 134 17 excluded exclude VBD 8961 134 18 input input NN 8961 134 19 / / SYM 8961 134 20 output output NN 8961 134 21 functions function NNS 8961 134 22 . . . 8961 135 1 Search search NN 8961 135 2 times time NNS 8961 135 3 for for IN 8961 135 4 the the DT 8961 135 5 binary binary JJ 8961 135 6 vector vector NN 8961 135 7 file file NN 8961 135 8 include include VBP 8961 135 9 expansion expansion NN 8961 135 10 of of IN 8961 135 11 the the DT 8961 135 12 compressed compress VBN 8961 135 13 vectors vector NNS 8961 135 14 , , , 8961 135 15 Boolean boolean JJ 8961 135 16 manipulation manipulation NN 8961 135 17 of of IN 8961 135 18 the the DT 8961 135 19 vectors vector NNS 8961 135 20 , , , 8961 135 21 and and CC 8961 135 22 conversion conversion NN 8961 135 23 of of IN 8961 135 24 the the DT 8961 135 25 resultant resultant JJ 8961 135 26 vector vector NN 8961 135 27 into into IN 8961 135 28 digital digital JJ 8961 135 29 document document NN 8961 135 30 numbers number NNS 8961 135 31 . . . 8961 136 1 The the DT 8961 136 2 times time NNS 8961 136 3 for for IN 8961 136 4 the the DT 8961 136 5 standard standard JJ 8961 136 6 inverted invert VBN 8961 136 7 file file NN 8961 136 8 are be VBP 8961 136 9 for for IN 8961 136 10 the the DT 8961 136 11 Boolean boolean JJ 8961 136 12 manipulation manipulation NN 8961 136 13 of of IN 8961 136 14 the the DT 8961 136 15 lists list NNS 8961 136 16 . . . 
8961 137 1 The the DT 8961 137 2 following follow VBG 8961 137 3 points point NNS 8961 137 4 were be VBD 8961 137 5 noted note VBN 8961 137 6 in in IN 8961 137 7 the the DT 8961 137 8 analysis analysis NN 8961 137 9 of of IN 8961 137 10 the the DT 8961 137 11 times time NNS 8961 137 12 : : : 8961 137 13 1 1 CD 8961 137 14 . . . 8961 138 1 In in IN 8961 138 2 twenty twenty CD 8961 138 3 - - HYPH 8961 138 4 two two CD 8961 138 5 of of IN 8961 138 6 the the DT 8961 138 7 forty forty CD 8961 138 8 queries query NNS 8961 138 9 for for IN 8961 138 10 which which WDT 8961 138 11 comparative comparative JJ 8961 138 12 timings timing NNS 8961 138 13 were be VBD 8961 138 14 obtained obtain VBN 8961 138 15 , , , 8961 138 16 the the DT 8961 138 17 search search NN 8961 138 18 of of IN 8961 138 19 the the DT 8961 138 20 binary binary JJ 8961 138 21 vector vector NN 8961 138 22 file file NN 8961 138 23 was be VBD 8961 138 24 faster fast RBR 8961 138 25 , , , 8961 138 26 in in IN 8961 138 27 one one CD 8961 138 28 case case NN 8961 138 29 by by IN 8961 138 30 a a DT 8961 138 31 factor factor NN 8961 138 32 of of IN 8961 138 33 thirty thirty CD 8961 138 34 - - HYPH 8961 138 35 five five CD 8961 138 36 . . . 8961 139 1 In in IN 8961 139 2 the the DT 8961 139 3 eighteen eighteen CD 8961 139 4 cases case NNS 8961 139 5 in in IN 8961 139 6 which which WDT 8961 139 7 the the DT 8961 139 8 search search NN 8961 139 9 of of IN 8961 139 10 the the DT 8961 139 11 standard standard JJ 8961 139 12 inverted invert VBN 8961 139 13 file file NN 8961 139 14 was be VBD 8961 139 15 faster fast JJR 8961 139 16 , , , 8961 139 17 the the DT 8961 139 18 search search NN 8961 139 19 of of IN 8961 139 20 the the DT 8961 139 21 stan- stan- NNP 8961 139 22 dard dard NNP 8961 139 23 inverted invert VBN 8961 139 24 file file NN 8961 139 25 was be VBD 8961 139 26 at at IN 8961 139 27 most most JJS 8961 139 28 6.17 6.17 CD 8961 139 29 times time NNS 8961 139 30 faster fast RBR 8961 139 31 . . . 
8961 140 1 2 2 LS 8961 140 2 . . . 8961 141 1 The the DT 8961 141 2 range range NN 8961 141 3 of of IN 8961 141 4 the the DT 8961 141 5 total total JJ 8961 141 6 times time NNS 8961 141 7 for for IN 8961 141 8 the the DT 8961 141 9 binary binary JJ 8961 141 10 vector vector NN 8961 141 11 file file NN 8961 141 12 was be VBD 8961 141 13 .79 .79 CD 8961 141 14 seconds second NNS 8961 141 15 to to IN 8961 141 16 9.72 9.72 CD 8961 141 17 seconds second NNS 8961 141 18 . . . 8961 142 1 The the DT 8961 142 2 range range NN 8961 142 3 for for IN 8961 142 4 searching search VBG 8961 142 5 the the DT 8961 142 6 standard standard JJ 8961 142 7 inverted invert VBN 8961 142 8 file file NN 8961 142 9 was be VBD 8961 142 10 .15 .15 CD 8961 142 11 seconds second NNS 8961 142 12 to to IN 8961 142 13 202.98 202.98 CD 8961 142 14 seconds second NNS 8961 142 15 . . . 8961 143 1 The the DT 8961 143 2 fact fact NN 8961 143 3 that that IN 8961 143 4 the the DT 8961 143 5 search search NN 8961 143 6 times time NNS 8961 143 7 for for IN 8961 143 8 the the DT 8961 143 9 binary binary JJ 8961 143 10 vector vector NN 8961 143 11 file file NN 8961 143 12 are be VBP 8961 143 13 within within IN 8961 143 14 a a DT 8961 143 15 fairly fairly RB 8961 143 16 narrow narrow JJ 8961 143 17 range range NN 8961 143 18 , , , 8961 143 19 in in IN 8961 143 20 contrast contrast NN 8961 143 21 to to IN 8961 143 22 the the DT 8961 143 23 wider wide JJR 8961 143 24 range range NN 8961 143 25 of of IN 8961 143 26 times time NNS 8961 143 27 for for IN 8961 143 28 searching search VBG 8961 143 29 the the DT 8961 143 30 standard standard JJ 8961 143 31 inverted invert VBN 8961 143 32 file file NN 8961 143 33 , , , 8961 143 34 has have VBZ 8961 143 35 im- im- DT 8961 143 36 portant portant JJ 8961 143 37 implications implication NNS 8961 143 38 for for IN 8961 143 39 the the DT 8961 143 40 design design NN 8961 143 41 of of IN 8961 143 42 an an DT 8961 143 43 on on IN 8961 143 44 - - HYPH 8961 143 45 line line NN 
8961 143 46 interactive interactive JJ 8961 143 47 docu- docu- NN 8961 143 48 ment ment JJ 8961 143 49 retrieval retrieval NN 8961 143 50 system system NN 8961 143 51 . . . 8961 144 1 In in IN 8961 144 2 such such PDT 8961 144 3 a a DT 8961 144 4 system system NN 8961 144 5 it -PRON- PRP 8961 144 6 is be VBZ 8961 144 7 important important JJ 8961 144 8 that that IN 8961 144 9 the the DT 8961 144 10 com- com- NN 8961 144 11 puter puter NN 8961 144 12 respond respond NN 8961 144 13 to to IN 8961 144 14 users user NNS 8961 144 15 ' ' POS 8961 144 16 requests request NNS 8961 144 17 not not RB 8961 144 18 only only RB 8961 144 19 rapidly rapidly RB 8961 144 20 but but CC 8961 144 21 consistently consistently RB 8961 144 22 . . . 8961 145 1 The the DT 8961 145 2 narrower narrow JJR 8961 145 3 range range NN 8961 145 4 of of IN 8961 145 5 the the DT 8961 145 6 search search NN 8961 145 7 times time NNS 8961 145 8 provided provide VBN 8961 145 9 by by IN 8961 145 10 the the DT 8961 145 11 binary binary JJ 8961 145 12 vector vector NN 8961 145 13 file file NN 8961 145 14 will will MD 8961 145 15 assist assist VB 8961 145 16 in in IN 8961 145 17 producing produce VBG 8961 145 18 consistent consistent JJ 8961 145 19 times time NNS 8961 145 20 . . . 8961 146 1 3 3 LS 8961 146 2 . . . 
8961 147 1 The the DT 8961 147 2 search search NN 8961 147 3 times time NNS 8961 147 4 for for IN 8961 147 5 the the DT 8961 147 6 binary binary JJ 8961 147 7 vector vector NN 8961 147 8 file file NN 8961 147 9 , , , 8961 147 10 exclusive exclusive JJ 8961 147 11 of of IN 8961 147 12 expansion expansion NN 8961 147 13 and and CC 8961 147 14 conversion conversion NN 8961 147 15 times time NNS 8961 147 16 , , , 8961 147 17 are be VBP 8961 147 18 unaffected unaffected JJ 8961 147 19 by by IN 8961 147 20 the the DT 8961 147 21 number number NN 8961 147 22 of of IN 8961 147 23 postings posting NNS 8961 147 24 con- con- NN 8961 147 25 314 314 CD 8961 147 26 Journal Journal NNP 8961 147 27 of of IN 8961 147 28 Library Library NNP 8961 147 29 Automation Automation NNP 8961 147 30 Vol Vol NNP 8961 147 31 . . . 8961 148 1 7/4 7/4 CD 8961 148 2 December December NNP 8961 148 3 1974 1974 CD 8961 148 4 tained taine VBD 8961 148 5 in in IN 8961 148 6 the the DT 8961 148 7 index index NN 8961 148 8 terms term NNS 8961 148 9 used use VBN 8961 148 10 in in IN 8961 148 11 a a DT 8961 148 12 query query NN 8961 148 13 . . . 8961 149 1 On on IN 8961 149 2 the the DT 8961 149 3 other other JJ 8961 149 4 hand hand NN 8961 149 5 , , , 8961 149 6 the the DT 8961 149 7 number number NN 8961 149 8 of of IN 8961 149 9 postings posting NNS 8961 149 10 in in IN 8961 149 11 the the DT 8961 149 12 records record NNS 8961 149 13 used use VBN 8961 149 14 from from IN 8961 149 15 the the DT 8961 149 16 standard standard JJ 8961 149 17 inverted invert VBN 8961 149 18 file file NN 8961 149 19 appears appear VBZ 8961 149 20 to to TO 8961 149 21 cause cause VB 8961 149 22 the the DT 8961 149 23 differences difference NNS 8961 149 24 in in IN 8961 149 25 search search NN 8961 149 26 times time NNS 8961 149 27 for for IN 8961 149 28 that that DT 8961 149 29 file file NN 8961 149 30 . . . 8961 150 1 To to TO 8961 150 2 test test VB 8961 150 3 the the DT 8961 150 4 conjectures conjecture NNS 8961 150 5 ! ! 
. 8961 151 1 that that IN 8961 151 2 1 1 CD 8961 151 3 . . . 8961 151 4 search search NN 8961 151 5 times time NNS 8961 151 6 for for IN 8961 151 7 the the DT 8961 151 8 binary binary JJ 8961 151 9 vector vector NN 8961 151 10 file file NN 8961 151 11 are be VBP 8961 151 12 related relate VBN 8961 151 13 to to IN 8961 151 14 the the DT 8961 151 15 number number NN 8961 151 16 of of IN 8961 151 17 index index NN 8961 151 18 terms term NNS 8961 151 19 in in IN 8961 151 20 the the DT 8961 151 21 query query NN 8961 151 22 , , , 8961 151 23 and and CC 8961 151 24 2 2 CD 8961 151 25 . . . 8961 151 26 search search NN 8961 151 27 times time NNS 8961 151 28 for for IN 8961 151 29 the the DT 8961 151 30 standard standard JJ 8961 151 31 inverted invert VBN 8961 151 32 file file NN 8961 151 33 are be VBP 8961 151 34 related relate VBN 8961 151 35 to to IN 8961 151 36 the the DT 8961 151 37 num- num- NNP 8961 151 38 ber ber NN 8961 151 39 of of IN 8961 151 40 postings posting NNS 8961 151 41 in in IN 8961 151 42 the the DT 8961 151 43 index index NN 8961 151 44 terms term NNS 8961 151 45 in in IN 8961 151 46 the the DT 8961 151 47 query query NN 8961 151 48 , , , 8961 151 49 a a DT 8961 151 50 correlation correlation NN 8961 151 51 analysis analysis NN 8961 151 52 was be VBD 8961 151 53 performed perform VBN 8961 151 54 . . . 
8961 152 1 The the DT 8961 152 2 following following JJ 8961 152 3 correlation correlation NN 8961 152 4 co- co- JJ 8961 152 5 efficients efficient NNS 8961 152 6 were be VBD 8961 152 7 obtained obtain VBN 8961 152 8 : : : 8961 152 9 V v NN 8961 152 10 a1'iables a1'iable NNS 8961 152 11 1 1 CD 8961 152 12 ' ' '' 8961 152 13 Number number NN 8961 152 14 of of IN 8961 152 15 terms term NNS 8961 152 16 in in IN 8961 152 17 query query NN 8961 152 18 and and CC 8961 152 19 search search NN 8961 152 20 .960 .960 CD 8961 152 21 times time NNS 8961 152 22 for for IN 8961 152 23 the the DT 8961 152 24 binary binary JJ 8961 152 25 vector vector NN 8961 152 26 file file NN 8961 152 27 . . . 8961 153 1 Number number NN 8961 153 2 of of IN 8961 153 3 postings posting NNS 8961 153 4 in in IN 8961 153 5 query query NN 8961 153 6 terms term NNS 8961 153 7 and and CC 8961 153 8 .979 .979 CD 8961 153 9 search search NN 8961 153 10 times time NNS 8961 153 11 for for IN 8961 153 12 standard standard JJ 8961 153 13 inverted invert VBN 8961 153 14 file file NN 8961 153 15 . . . 8961 154 1 The the DT 8961 154 2 relationships relationship NNS 8961 154 3 indicated indicate VBD 8961 154 4 above above RB 8961 154 5 are be VBP 8961 154 6 significant significant JJ 8961 154 7 at at IN 8961 154 8 the the DT 8961 154 9 .001 .001 CD 8961 154 10 level level NN 8961 154 11 . . . 
8961 155 1 No no DT 8961 155 2 attempt attempt NN 8961 155 3 was be VBD 8961 155 4 made make VBN 8961 155 5 to to TO 8961 155 6 compute compute VB 8961 155 7 an an DT 8961 155 8 average average JJ 8961 155 9 search search NN 8961 155 10 time time NN 8961 155 11 per per IN 8961 155 12 term term NN 8961 155 13 for for IN 8961 155 14 the the DT 8961 155 15 binary binary JJ 8961 155 16 vector vector NN 8961 155 17 file file NN 8961 155 18 or or CC 8961 155 19 average average JJ 8961 155 20 search search NN 8961 155 21 time time NN 8961 155 22 per per IN 8961 155 23 posting post VBG 8961 155 24 for for IN 8961 155 25 the the DT 8961 155 26 standard standard JJ 8961 155 27 in- in- NNP 8961 155 28 verted verted JJ 8961 155 29 file file NN 8961 155 30 . . . 8961 156 1 Such such JJ 8961 156 2 times time NNS 8961 156 3 would would MD 8961 156 4 have have VB 8961 156 5 meaning meaning NN 8961 156 6 only only RB 8961 156 7 for for IN 8961 156 8 the the DT 8961 156 9 model model NN 8961 156 10 systems system NNS 8961 156 11 . . . 8961 157 1 SUMMARY summary VB 8961 157 2 The the DT 8961 157 3 binary binary JJ 8961 157 4 vector vector NN 8961 157 5 is be VBZ 8961 157 6 suggested suggest VBN 8961 157 7 as as IN 8961 157 8 an an DT 8961 157 9 alternative alternative NN 8961 157 10 to to IN 8961 157 11 the the DT 8961 157 12 usual usual JJ 8961 157 13 method method NN 8961 157 14 of of IN 8961 157 15 storing store VBG 8961 157 16 document document NN 8961 157 17 pointers pointer NNS 8961 157 18 in in IN 8961 157 19 an an DT 8961 157 20 inverted invert VBN 8961 157 21 index index NN 8961 157 22 file file NN 8961 157 23 . . . 
8961 158 1 The the DT 8961 158 2 binary binary JJ 8961 158 3 vector vector NN 8961 158 4 file file NN 8961 158 5 can can MD 8961 158 6 provide provide VB 8961 158 7 savings saving NNS 8961 158 8 in in IN 8961 158 9 storage storage NN 8961 158 10 space space NN 8961 158 11 , , , 8961 158 12 search search NN 8961 158 13 times time NNS 8961 158 14 , , , 8961 158 15 and and CC 8961 158 16 programming programming NN 8961 158 17 effort effort NN 8961 158 18 . . . 8961 159 1 REFERENCES reference NNS 8961 159 2 1 1 CD 8961 159 3 . . . 8961 160 1 Burton Burton NNP 8961 160 2 H. H. NNP 8961 160 3 Bloom Bloom NNP 8961 160 4 , , , 8961 160 5 " " '' 8961 160 6 Some some DT 8961 160 7 Techniques Techniques NNPS 8961 160 8 and and CC 8961 160 9 Trade Trade NNP 8961 160 10 - - HYPH 8961 160 11 Offs Offs NNP 8961 160 12 Affecting Affecting NNP 8961 160 13 Large Large NNP 8961 160 14 Data Data NNP 8961 160 15 Base Base NNP 8961 160 16 Retrieval Retrieval NNP 8961 160 17 Times Times NNP 8961 160 18 , , , 8961 160 19 " " `` 8961 160 20 Proceedings proceeding NNS 8961 160 21 of of IN 8961 160 22 the the DT 8961 160 23 ACM ACM NNP 8961 160 24 24 24 CD 8961 160 25 ( ( -LRB- 8961 160 26 1969 1969 CD 8961 160 27 ) ) -RRB- 8961 160 28 . . . 8961 161 1 2 2 LS 8961 161 2 . . . 8961 162 1 D. D. NNP 8961 162 2 R. R. NNP 8961 162 3 Davis Davis NNP 8961 162 4 and and CC 8961 162 5 A. A. NNP 8961 162 6 D. D. 
NNP 8961 162 7 Lin Lin NNP 8961 162 8 , , , 8961 162 9 " " `` 8961 162 10 Secondary Secondary NNP 8961 162 11 Key Key NNP 8961 162 12 Retrieval Retrieval NNP 8961 162 13 Using use VBG 8961 162 14 an an DT 8961 162 15 IBM IBM NNP 8961 162 16 7090 7090 CD 8961 162 17 - - HYPH 8961 162 18 1310 1310 CD 8961 162 19 System system NN 8961 162 20 , , , 8961 162 21 " " '' 8961 162 22 Communications communication NNS 8961 162 23 of of IN 8961 162 24 the the DT 8961 162 25 ACM ACM NNP 8961 162 26 8:243 8:243 CD 8961 162 27 - - SYM 8961 162 28 46 46 CD 8961 162 29 ( ( -LRB- 8961 162 30 April1965 April1965 NNP 8961 162 31 ) ) -RRB- 8961 162 32 . . . 8961 163 1 3 3 LS 8961 163 2 . . . 8961 164 1 John John NNP 8961 164 2 W. W. NNP 8961 164 3 Sammon Sammon NNP 8961 164 4 , , , 8961 164 5 Some some DT 8961 164 6 Mathematics Mathematics NNPS 8961 164 7 of of IN 8961 164 8 Information Information NNP 8961 164 9 Storage Storage NNP 8961 164 10 and and CC 8961 164 11 Retrieval Retrieval NNP 8961 164 12 ( ( -LRB- 8961 164 13 Tech- Tech- NNP 8961 164 14 nical nical JJ 8961 164 15 Report Report NNP 8961 164 16 RADC RADC NNP 8961 164 17 - - HYPH 8961 164 18 Tr-68 tr-68 WP 8961 164 19 - - HYPH 8961 164 20 178 178 CD 8961 164 21 [ [ -LRB- 8961 164 22 Rome Rome NNP 8961 164 23 , , , 8961 164 24 New New NNP 8961 164 25 York York NNP 8961 164 26 : : : 8961 164 27 Rome Rome NNP 8961 164 28 Air Air NNP 8961 164 29 Development Development NNP 8961 164 30 Center Center NNP 8961 164 31 , , , 8961 164 32 1968 1968 CD 8961 164 33 ] ] -RRB- 8961 164 34 ) ) -RRB- 8961 164 35 . . . 8961 165 1 4 4 LS 8961 165 2 . . . 8961 166 1 S. S. NNP 8961 166 2 A. A. 
NNP 8961 166 3 Gorokhov Gorokhov NNP 8961 166 4 , , , 8961 166 5 " " `` 8961 166 6 The the DT 8961 166 7 ' ' `` 8961 166 8 Setka-3 Setka-3 NNP 8961 166 9 ' ' '' 8961 166 10 Automated Automated NNP 8961 166 11 IRS IRS NNP 8961 166 12 on on IN 8961 166 13 the the DT 8961 166 14 ' ' `` 8961 166 15 Minsk-22 minsk-22 NN 8961 166 16 ' ' '' 8961 166 17 with with IN 8961 166 18 the the DT 8961 166 19 Use Use NNP 8961 166 20 of of IN 8961 166 21 the the DT 8961 166 22 Socket Socket NNP 8961 166 23 Associative Associative NNP 8961 166 24 - - HYPH 8961 166 25 Address Address NNP 8961 166 26 Method Method NNP 8961 166 27 of of IN 8961 166 28 Organization Organization NNP 8961 166 29 of of IN 8961 166 30 Information Information NNP 8961 166 31 " " '' 8961 166 32 ( ( -LRB- 8961 166 33 Paper paper NN 8961 166 34 presented present VBD 8961 166 35 at at IN 8961 166 36 the the DT 8961 166 37 All All NNP 8961 166 38 - - HYPH 8961 166 39 Union Union NNP 8961 166 40 Conference Conference NNP 8961 166 41 on on IN 8961 166 42 Information Information NNP 8961 166 43 Retrieval Retrieval NNP 8961 166 44 Systems Systems NNP 8961 166 45 and and CC 8961 166 46 Auto- Auto- NNPS 8961 166 47 matic matic JJ 8961 166 48 Processing Processing NNP 8961 166 49 of of IN 8961 166 50 Scientific Scientific NNP 8961 166 51 and and CC 8961 166 52 Technical Technical NNP 8961 166 53 Information Information NNP 8961 166 54 , , , 8961 166 55 Moscow Moscow NNP 8961 166 56 , , , 8961 166 57 1967 1967 CD 8961 166 58 . . . 8961 167 1 Translated translate VBN 8961 167 2 and and CC 8961 167 3 published publish VBN 8961 167 4 as as IN 8961 167 5 part part NN 8961 167 6 of of IN 8961 167 7 AD ad NN 8961 167 8 697 697 CD 8961 167 9 687 687 CD 8961 167 10 , , , 8961 167 11 National National NNP 8961 167 12 Technical Technical NNP 8961 167 13 Information Information NNP 8961 167 14 Service Service NNP 8961 167 15 ) ) -RRB- 8961 167 16 . . . 8961 168 1 5 5 CD 8961 168 2 . . . 8961 169 1 H. H. NNP 8961 169 2 S. S. 
NNP 8961 169 3 Heaps Heaps NNP 8961 169 4 and and CC 8961 169 5 L. L. NNP 8961 169 6 H. H. NNP 8961 169 7 Thiel Thiel NNP 8961 169 8 , , , 8961 169 9 " " `` 8961 169 10 Optimum optimum JJ 8961 169 11 Procedures procedure NNS 8961 169 12 for for IN 8961 169 13 Economic Economic NNP 8961 169 14 Information Information NNP 8961 169 15 Re- Re- NNP 8961 169 16 trieval trieval NN 8961 169 17 , , , 8961 169 18 " " `` 8961 169 19 Information Information NNP 8961 169 20 Storage Storage NNP 8961 169 21 & & CC 8961 169 22 Retrieval6:131 Retrieval6:131 NNP 8961 169 23 - - HYPH 8961 169 24 53 53 CD 8961 169 25 ( ( -LRB- 8961 169 26 1970 1970 CD 8961 169 27 ) ) -RRB- 8961 169 28 . . . 8961 170 1 6 6 CD 8961 170 2 . . . 8961 171 1 L. L. NNP 8961 171 2 H. H. NNP 8961 171 3 Thiel Thiel NNP 8961 171 4 and and CC 8961 171 5 H. H. NNP 8961 171 6 S. S. NNP 8961 171 7 Heaps Heaps NNP 8961 171 8 , , , 8961 171 9 " " '' 8961 171 10 Program Program NNP 8961 171 11 Design Design NNP 8961 171 12 for for IN 8961 171 13 Retrospective Retrospective NNP 8961 171 14 Searches Searches NNPS 8961 171 15 on on IN 8961 171 16 Large Large NNP 8961 171 17 Data datum NNS 8961 171 18 Bases basis NNS 8961 171 19 , , , 8961 171 20 " " `` 8961 171 21 Information Information NNP 8961 171 22 Storage Storage NNP 8961 171 23 & & CC 8961 171 24 Retrieval8:1 Retrieval8:1 NNP 8961 171 25 - - HYPH 8961 171 26 20 20 CD 8961 171 27 ( ( -LRB- 8961 171 28 1972 1972 CD 8961 171 29 ) ) -RRB- 8961 171 30 . . . 8961 172 1 7 7 LS 8961 172 2 . . . 8961 173 1 D. D. NNP 8961 173 2 R. R. NNP 8961 173 3 King King NNP 8961 173 4 , , , 8961 173 5 " " '' 8961 173 6 An an DT 8961 173 7 Inverted invert VBN 8961 173 8 File File NNP 8961 173 9 Structure structure NN 8961 173 10 for for IN 8961 173 11 an an DT 8961 173 12 Interactive Interactive NNP 8961 173 13 Document Document NNP 8961 173 14 Retrieval Retrieval NNP 8961 173 15 System System NNP 8961 173 16 " " '' 8961 173 17 ( ( -LRB- 8961 173 18 Ph.D. Ph.D. 
NNP 8961 173 19 dissertation dissertation NN 8961 173 20 , , , 8961 173 21 Rutgers Rutgers NNP 8961 173 22 University University NNP 8961 173 23 , , , 8961 173 24 1971 1971 CD 8961 173 25 ) ) -RRB- 8961 173 26 . . .