id sid tid token lemma pos
3352 1 1 SEARCH SEARCH NNS 3352 1 2 ACROSS acros VBD 3352 1 3 DIFFERENT DIFFERENT NNP 3352 1 4 MEDIA medium NNS 3352 1 5 | | CD 3352 1 6 BUCKLAND BUCKLAND NNP 3352 1 7 , , , 3352 1 8 CHEN CHEN NNP 3352 1 9 , , , 3352 1 10 GEY GEY NNP 3352 1 11 , , , 3352 1 12 AND and CC 3352 1 13 LARSON LARSON NNP 3352 1 14 181 181 CD 3352 1 15 Digital Digital NNP 3352 1 16 technology technology NN 3352 1 17 encourages encourage VBZ 3352 1 18 the the DT 3352 1 19 hope hope NN 3352 1 20 of of IN 3352 1 21 searching search VBG 3352 1 22 across across RB 3352 1 23 and and CC 3352 1 24 between between IN 3352 1 25 different different JJ 3352 1 26 media medium NNS 3352 1 27 forms form NNS 3352 1 28 ( ( -LRB- 3352 1 29 text text NN 3352 1 30 , , , 3352 1 31 sound sound NN 3352 1 32 , , , 3352 1 33 image image NN 3352 1 34 , , , 3352 1 35 numeric numeric JJ 3352 1 36 data datum NNS 3352 1 37 ) ) -RRB- 3352 1 38 . . . 3352 2 1 Topic topic JJ 3352 2 2 searches search NNS 3352 2 3 are be VBP 3352 2 4 described describe VBN 3352 2 5 in in IN 3352 2 6 two two CD 3352 2 7 different different JJ 3352 2 8 media medium NNS 3352 2 9 : : : 3352 2 10 text text NN 3352 2 11 files file NNS 3352 2 12 and and CC 3352 2 13 socioeconomic socioeconomic JJ 3352 2 14 numeric numeric JJ 3352 2 15 databases database NNS 3352 2 16 and and CC 3352 2 17 also also RB 3352 2 18 for for IN 3352 2 19 transverse transverse JJ 3352 2 20 searching searching NN 3352 2 21 , , , 3352 2 22 whereby whereby WRB 3352 2 23 retrieved retrieve VBN 3352 2 24 text text NN 3352 2 25 is be VBZ 3352 2 26 used use VBN 3352 2 27 to to TO 3352 2 28 find find VB 3352 2 29 topically topically RB 3352 2 30 related relate VBN 3352 2 31 numeric numeric JJ 3352 2 32 data datum NNS 3352 2 33 and and CC 3352 2 34 vice vice NN 3352 2 35 versa versa RB 3352 2 36 . . . 
3352 3 1 Direct direct JJ 3352 3 2 transverse transverse NN 3352 3 3 searching search VBG 3352 3 4 across across IN 3352 3 5 different different JJ 3352 3 6 media media NN 3352 3 7 is be VBZ 3352 3 8 impossible impossible JJ 3352 3 9 . . . 3352 4 1 Descriptive descriptive JJ 3352 4 2 metadata metadata NN 3352 4 3 pro- pro- NN 3352 4 4 vide vide NN 3352 4 5 enabling enable VBG 3352 4 6 infrastructure infrastructure NN 3352 4 7 , , , 3352 4 8 but but CC 3352 4 9 usually usually RB 3352 4 10 require require VBP 3352 4 11 map- map- NN 3352 4 12 pings ping NNS 3352 4 13 between between IN 3352 4 14 different different JJ 3352 4 15 vocabularies vocabulary NNS 3352 4 16 and and CC 3352 4 17 a a DT 3352 4 18 search search NN 3352 4 19 - - HYPH 3352 4 20 term term NN 3352 4 21 recommender recommender NN 3352 4 22 system system NN 3352 4 23 . . . 3352 5 1 Statistical statistical JJ 3352 5 2 association association NN 3352 5 3 techniques technique NNS 3352 5 4 and and CC 3352 5 5 natural natural JJ 3352 5 6 - - HYPH 3352 5 7 language language NN 3352 5 8 processing processing NN 3352 5 9 can can MD 3352 5 10 help help VB 3352 5 11 . . . 3352 6 1 Searches search NNS 3352 6 2 in in IN 3352 6 3 socioeconomic socioeconomic JJ 3352 6 4 numeric numeric JJ 3352 6 5 databases database NNS 3352 6 6 ordinarily ordinarily RB 3352 6 7 require require VBP 3352 6 8 that that DT 3352 6 9 place place NN 3352 6 10 and and CC 3352 6 11 time time NN 3352 6 12 be be VB 3352 6 13 specified specify VBN 3352 6 14 . . . 
3352 7 1 A a DT 3352 7 2 hope hope NN 3352 7 3 for for IN 3352 7 4 libraries library NNS 3352 7 5 is be VBZ 3352 7 6 that that DT 3352 7 7 new new JJ 3352 7 8 technology technology NN 3352 7 9 will will MD 3352 7 10 support support VB 3352 7 11 searching search VBG 3352 7 12 across across IN 3352 7 13 an an DT 3352 7 14 increasing increase VBG 3352 7 15 range range NN 3352 7 16 of of IN 3352 7 17 resources resource NNS 3352 7 18 in in IN 3352 7 19 a a DT 3352 7 20 growing grow VBG 3352 7 21 digital digital JJ 3352 7 22 landscape landscape NN 3352 7 23 . . . 3352 8 1 The the DT 3352 8 2 rise rise NN 3352 8 3 of of IN 3352 8 4 the the DT 3352 8 5 Internet internet NN 3352 8 6 provides provide VBZ 3352 8 7 a a DT 3352 8 8 technological technological JJ 3352 8 9 basis basis NN 3352 8 10 for for IN 3352 8 11 shared shared JJ 3352 8 12 access access NN 3352 8 13 to to IN 3352 8 14 a a DT 3352 8 15 very very RB 3352 8 16 wide wide JJ 3352 8 17 range range NN 3352 8 18 of of IN 3352 8 19 resources resource NNS 3352 8 20 . . . 
3352 9 1 The the DT 3352 9 2 reality reality NN 3352 9 3 is be VBZ 3352 9 4 that that IN 3352 9 5 network network NN 3352 9 6 - - HYPH 3352 9 7 accessible accessible JJ 3352 9 8 resources resource NNS 3352 9 9 , , , 3352 9 10 like like IN 3352 9 11 the the DT 3352 9 12 contents content NNS 3352 9 13 of of IN 3352 9 14 a a DT 3352 9 15 well well RB 3352 9 16 - - HYPH 3352 9 17 stocked stock VBN 3352 9 18 reference reference NN 3352 9 19 library library NN 3352 9 20 , , , 3352 9 21 are be VBP 3352 9 22 quite quite RB 3352 9 23 heterogeneous heterogeneous JJ 3352 9 24 , , , 3352 9 25 especially especially RB 3352 9 26 in in IN 3352 9 27 the the DT 3352 9 28 variety variety NN 3352 9 29 of of IN 3352 9 30 indexing indexing NN 3352 9 31 , , , 3352 9 32 classification classification NN 3352 9 33 , , , 3352 9 34 catego- catego- NN 3352 9 35 rization rization NN 3352 9 36 , , , 3352 9 37 and and CC 3352 9 38 other other JJ 3352 9 39 forms form NNS 3352 9 40 of of IN 3352 9 41 metadata metadata NN 3352 9 42 . . . 
3352 10 1 However however RB 3352 10 2 , , , 3352 10 3 the the DT 3352 10 4 use use NN 3352 10 5 of of IN 3352 10 6 digital digital JJ 3352 10 7 technology technology NN 3352 10 8 implies imply VBZ 3352 10 9 a a DT 3352 10 10 degree degree NN 3352 10 11 of of IN 3352 10 12 technical technical JJ 3352 10 13 compat- compat- NN 3352 10 14 ibility ibility NN 3352 10 15 between between IN 3352 10 16 different different JJ 3352 10 17 media medium NNS 3352 10 18 , , , 3352 10 19 sometimes sometimes RB 3352 10 20 referred refer VBD 3352 10 21 to to IN 3352 10 22 as as IN 3352 10 23 “ " `` 3352 10 24 media medium NNS 3352 10 25 convergence convergence NN 3352 10 26 , , , 3352 10 27 ” " '' 3352 10 28 and and CC 3352 10 29 these these DT 3352 10 30 developments development NNS 3352 10 31 encourage encourage VBP 3352 10 32 the the DT 3352 10 33 prospect prospect NN 3352 10 34 of of IN 3352 10 35 being be VBG 3352 10 36 able able JJ 3352 10 37 to to TO 3352 10 38 search search VB 3352 10 39 across across RB 3352 10 40 and and CC 3352 10 41 between between IN 3352 10 42 different different JJ 3352 10 43 media medium NNS 3352 10 44 forms form NNS 3352 10 45 — — : 3352 10 46 notably notably RB 3352 10 47 text text NN 3352 10 48 , , , 3352 10 49 images image NNS 3352 10 50 , , , 3352 10 51 sound sound JJ 3352 10 52 , , , 3352 10 53 and and CC 3352 10 54 numeric numeric JJ 3352 10 55 data data NN 3352 10 56 sets set NNS 3352 10 57 — — : 3352 10 58 for for IN 3352 10 59 different different JJ 3352 10 60 kinds kind NNS 3352 10 61 of of IN 3352 10 62 material material NN 3352 10 63 relat- relat- NN 3352 10 64 ing ing NNP 3352 10 65 to to IN 3352 10 66 the the DT 3352 10 67 same same JJ 3352 10 68 topic topic NN 3352 10 69 . . . 
3352 11 1 To to TO 3352 11 2 examine examine VB 3352 11 3 the the DT 3352 11 4 practical practical JJ 3352 11 5 problems problem NNS 3352 11 6 involved involve VBN 3352 11 7 , , , 3352 11 8 the the DT 3352 11 9 authors author NNS 3352 11 10 undertook undertake VBD 3352 11 11 to to TO 3352 11 12 demonstrate demonstrate VB 3352 11 13 searching search VBG 3352 11 14 between between IN 3352 11 15 and and CC 3352 11 16 across across IN 3352 11 17 two two CD 3352 11 18 different different JJ 3352 11 19 media medium NNS 3352 11 20 forms form NNS 3352 11 21 : : : 3352 11 22 text text NN 3352 11 23 files file NNS 3352 11 24 and and CC 3352 11 25 socioeconomic socioeconomic JJ 3352 11 26 numeric numeric JJ 3352 11 27 data datum NNS 3352 11 28 sets.1 sets.1 CD 3352 11 29 Two two CD 3352 11 30 kinds kind NNS 3352 11 31 of of IN 3352 11 32 search search NN 3352 11 33 are be VBP 3352 11 34 needed need VBN 3352 11 35 . . . 3352 12 1 First first RB 3352 12 2 , , , 3352 12 3 it -PRON- PRP 3352 12 4 should should MD 3352 12 5 be be VB 3352 12 6 pos- pos- VBN 3352 12 7 sible sible JJ 3352 12 8 to to TO 3352 12 9 do do VB 3352 12 10 a a DT 3352 12 11 topical topical JJ 3352 12 12 search search NN 3352 12 13 in in IN 3352 12 14 multiple multiple JJ 3352 12 15 media medium NNS 3352 12 16 resources resource NNS 3352 12 17 , , , 3352 12 18 so so IN 3352 12 19 that that IN 3352 12 20 one one PRP 3352 12 21 can can MD 3352 12 22 find find VB 3352 12 23 , , , 3352 12 24 for for IN 3352 12 25 example example NN 3352 12 26 , , , 3352 12 27 both both DT 3352 12 28 pertinent pertinent JJ 3352 12 29 factual factual JJ 3352 12 30 numeric numeric JJ 3352 12 31 data datum NNS 3352 12 32 and and CC 3352 12 33 relevant relevant JJ 3352 12 34 discussion discussion NN 3352 12 35 . . . 
3352 13 1 ( ( -LRB- 3352 13 2 One one CD 3352 13 3 difficulty difficulty NN 3352 13 4 is be VBZ 3352 13 5 that that IN 3352 13 6 the the DT 3352 13 7 vocabulary vocabulary NN 3352 13 8 used use VBD 3352 13 9 to to TO 3352 13 10 classify classify VB 3352 13 11 the the DT 3352 13 12 numeric numeric JJ 3352 13 13 data datum NNS 3352 13 14 is be VBZ 3352 13 15 ordinarily ordinarily RB 3352 13 16 quite quite RB 3352 13 17 different different JJ 3352 13 18 from from IN 3352 13 19 the the DT 3352 13 20 subject subject JJ 3352 13 21 headings heading NNS 3352 13 22 used use VBN 3352 13 23 for for IN 3352 13 24 books book NNS 3352 13 25 , , , 3352 13 26 magazine magazine NN 3352 13 27 articles article NNS 3352 13 28 , , , 3352 13 29 and and CC 3352 13 30 newspaper newspaper NN 3352 13 31 stories story NNS 3352 13 32 about about IN 3352 13 33 the the DT 3352 13 34 same same JJ 3352 13 35 topic topic NN 3352 13 36 . . . 3352 13 37 ) ) -RRB- 3352 14 1 Second second JJ 3352 14 2 , , , 3352 14 3 when when WRB 3352 14 4 intriguing intriguing JJ 3352 14 5 data data NN 3352 14 6 values value NNS 3352 14 7 are be VBP 3352 14 8 encountered encounter VBN 3352 14 9 , , , 3352 14 10 one one PRP 3352 14 11 would would MD 3352 14 12 like like VB 3352 14 13 to to TO 3352 14 14 move move VB 3352 14 15 directly directly RB 3352 14 16 to to IN 3352 14 17 topically topically RB 3352 14 18 relevant relevant JJ 3352 14 19 texts text NNS 3352 14 20 . . . 3352 15 1 Likewise likewise RB 3352 15 2 , , , 3352 15 3 when when WRB 3352 15 4 a a DT 3352 15 5 questionable questionable JJ 3352 15 6 statement statement NN 3352 15 7 is be VBZ 3352 15 8 read read VBN 3352 15 9 , , , 3352 15 10 one one PRP 3352 15 11 would would MD 3352 15 12 like like VB 3352 15 13 to to TO 3352 15 14 be be VB 3352 15 15 able able JJ 3352 15 16 to to TO 3352 15 17 find find VB 3352 15 18 relevant relevant JJ 3352 15 19 statisti- statisti- NN 3352 15 20 cal cal NN 3352 15 21 evidence evidence NN 3352 15 22 . . . 
3352 16 1 Therefore therefore RB 3352 16 2 , , , 3352 16 3 there there EX 3352 16 4 needs need VBZ 3352 16 5 to to TO 3352 16 6 be be VB 3352 16 7 search search NN 3352 16 8 support support NN 3352 16 9 that that WDT 3352 16 10 facilitates facilitate VBZ 3352 16 11 such such JJ 3352 16 12 transverse transverse NN 3352 16 13 searching search VBG 3352 16 14 among among IN 3352 16 15 resources resource NNS 3352 16 16 , , , 3352 16 17 establishing establish VBG 3352 16 18 connections connection NNS 3352 16 19 , , , 3352 16 20 transferring transfer VBG 3352 16 21 data datum NNS 3352 16 22 , , , 3352 16 23 and and CC 3352 16 24 invoking invoke VBG 3352 16 25 appropriate appropriate JJ 3352 16 26 utilities utility NNS 3352 16 27 in in IN 3352 16 28 a a DT 3352 16 29 helpful helpful JJ 3352 16 30 way way NN 3352 16 31 . . . 3352 17 1 Both both DT 3352 17 2 problems problem NNS 3352 17 3 were be VBD 3352 17 4 addressed address VBN 3352 17 5 through through IN 3352 17 6 the the DT 3352 17 7 design design NN 3352 17 8 and and CC 3352 17 9 demonstration demonstration NN 3352 17 10 of of IN 3352 17 11 a a DT 3352 17 12 gateway gateway NN 3352 17 13 providing providing NN 3352 17 14 search search NN 3352 17 15 sup- sup- JJ 3352 17 16 port port NN 3352 17 17 for for IN 3352 17 18 both both DT 3352 17 19 text text NN 3352 17 20 and and CC 3352 17 21 socioeconomic socioeconomic JJ 3352 17 22 numeric numeric JJ 3352 17 23 databases database NNS 3352 17 24 . . . 
3352 18 1 First first RB 3352 18 2 , , , 3352 18 3 the the DT 3352 18 4 gateway gateway NN 3352 18 5 should should MD 3352 18 6 help help VB 3352 18 7 users user NNS 3352 18 8 conduct conduct VB 3352 18 9 searches search NNS 3352 18 10 in in IN 3352 18 11 databases database NNS 3352 18 12 of of IN 3352 18 13 different different JJ 3352 18 14 media medium NNS 3352 18 15 forms form NNS 3352 18 16 by by IN 3352 18 17 accepting accept VBG 3352 18 18 a a DT 3352 18 19 query query NN 3352 18 20 in in IN 3352 18 21 the the DT 3352 18 22 searcher searcher NN 3352 18 23 ’s ’s POS 3352 18 24 own own JJ 3352 18 25 terms term NNS 3352 18 26 and and CC 3352 18 27 then then RB 3352 18 28 suggesting suggest VBG 3352 18 29 the the DT 3352 18 30 spe- spe- NN 3352 18 31 cialized cialize VBN 3352 18 32 categorization categorization NN 3352 18 33 terms term NNS 3352 18 34 to to TO 3352 18 35 search search VB 3352 18 36 for for IN 3352 18 37 in in IN 3352 18 38 the the DT 3352 18 39 selected select VBN 3352 18 40 resource resource NN 3352 18 41 . . . 3352 19 1 Second second JJ 3352 19 2 , , , 3352 19 3 if if IN 3352 19 4 something something NN 3352 19 5 interesting interesting JJ 3352 19 6 was be VBD 3352 19 7 found find VBN 3352 19 8 in in IN 3352 19 9 a a DT 3352 19 10 socioeconomic socioeconomic JJ 3352 19 11 database database NN 3352 19 12 , , , 3352 19 13 the the DT 3352 19 14 gateway gateway NN 3352 19 15 would would MD 3352 19 16 help help VB 3352 19 17 the the DT 3352 19 18 searcher searcher NN 3352 19 19 to to TO 3352 19 20 find find VB 3352 19 21 documents document NNS 3352 19 22 on on IN 3352 19 23 the the DT 3352 19 24 same same JJ 3352 19 25 topic topic NN 3352 19 26 in in IN 3352 19 27 a a DT 3352 19 28 text text NN 3352 19 29 database database NN 3352 19 30 , , , 3352 19 31 and and CC 3352 19 32 vice vice NN 3352 19 33 versa versa RB 3352 19 34 . . . 
3352 20 1 Selection selection NN 3352 20 2 of of IN 3352 20 3 the the DT 3352 20 4 best good JJS 3352 20 5 search search NN 3352 20 6 terms term NNS 3352 20 7 in in IN 3352 20 8 target target NN 3352 20 9 databases database NNS 3352 20 10 is be VBZ 3352 20 11 supported support VBN 3352 20 12 by by IN 3352 20 13 the the DT 3352 20 14 use use NN 3352 20 15 of of IN 3352 20 16 indexes index NNS 3352 20 17 to to IN 3352 20 18 the the DT 3352 20 19 categories category NNS 3352 20 20 ( ( -LRB- 3352 20 21 entries entry NNS 3352 20 22 , , , 3352 20 23 headings heading NNS 3352 20 24 , , , 3352 20 25 class class NN 3352 20 26 numbers number NNS 3352 20 27 ) ) -RRB- 3352 20 28 in in IN 3352 20 29 the the DT 3352 20 30 system system NN 3352 20 31 to to TO 3352 20 32 be be VB 3352 20 33 searched search VBN 3352 20 34 . . . 3352 21 1 These these DT 3352 21 2 search search NN 3352 21 3 - - HYPH 3352 21 4 term term NN 3352 21 5 recommender recommender NN 3352 21 6 systems system NNS 3352 21 7 ( ( -LRB- 3352 21 8 also also RB 3352 21 9 known know VBN 3352 21 10 as as IN 3352 21 11 “ " `` 3352 21 12 entry entry NN 3352 21 13 vocabulary vocabulary NN 3352 21 14 indexes index NNS 3352 21 15 ” " '' 3352 21 16 ) ) -RRB- 3352 21 17 resemble resemble JJ 3352 21 18 Dewey Dewey NNP 3352 21 19 ’s ’s POS 3352 21 20 “ " `` 3352 21 21 Relativ Relativ NNP 3352 21 22 Index Index NNP 3352 21 23 , , , 3352 21 24 ” " '' 3352 21 25 but but CC 3352 21 26 are be VBP 3352 21 27 created create VBN 3352 21 28 using use VBG 3352 21 29 statistical statistical JJ 3352 21 30 association association NN 3352 21 31 techniques.2 techniques.2 CC 3352 21 32 Four four CD 3352 21 33 characteristics characteristic NNS 3352 21 34 of of IN 3352 21 35 this this DT 3352 21 36 investigation investigation NN 3352 21 37 need need VBP 3352 21 38 to to TO 3352 21 39 be be VB 3352 21 40 noted note VBN 3352 21 41 : : : 3352 21 42 1 1 CD 3352 21 43 . . . 
3352 22 1 Searching search VBG 3352 22 2 independent independent JJ 3352 22 3 sources source NNS 3352 22 4 : : : 3352 22 5 The the DT 3352 22 6 authors author NNS 3352 22 7 were be VBD 3352 22 8 not not RB 3352 22 9 concerned concern VBN 3352 22 10 with with IN 3352 22 11 ingesting ingest VBG 3352 22 12 resources resource NNS 3352 22 13 from from IN 3352 22 14 differ- differ- JJ 3352 22 15 ent ent NN 3352 22 16 sources source NNS 3352 22 17 into into IN 3352 22 18 a a DT 3352 22 19 consolidated consolidated JJ 3352 22 20 local local JJ 3352 22 21 data datum NNS 3352 22 22 repository repository NN 3352 22 23 and and CC 3352 22 24 searching search VBG 3352 22 25 within within IN 3352 22 26 it -PRON- PRP 3352 22 27 . . . 3352 23 1 The the DT 3352 23 2 interest interest NN 3352 23 3 lay lie VBD 3352 23 4 , , , 3352 23 5 instead instead RB 3352 23 6 , , , 3352 23 7 in in IN 3352 23 8 being be VBG 3352 23 9 able able JJ 3352 23 10 to to TO 3352 23 11 search search VB 3352 23 12 effectively effectively RB 3352 23 13 in in IN 3352 23 14 any any DT 3352 23 15 accessible accessible JJ 3352 23 16 resource resource NN 3352 23 17 as as IN 3352 23 18 and and CC 3352 23 19 when when WRB 3352 23 20 one one PRP 3352 23 21 wants want VBZ 3352 23 22 . . . 3352 24 1 This this DT 3352 24 2 implies imply VBZ 3352 24 3 that that IN 3352 24 4 interoperability interoperability NN 3352 24 5 issues issue NNS 3352 24 6 in in IN 3352 24 7 dealing deal VBG 3352 24 8 with with IN 3352 24 9 the the DT 3352 24 10 native native JJ 3352 24 11 query query NN 3352 24 12 languages language NNS 3352 24 13 and and CC 3352 24 14 metadata metadata NN 3352 24 15 vocabularies vocabulary NNS 3352 24 16 of of IN 3352 24 17 remote remote JJ 3352 24 18 repositories repository NNS 3352 24 19 can can MD 3352 24 20 be be VB 3352 24 21 solved solve VBN 3352 24 22 . . . 3352 25 1 2 2 LS 3352 25 2 . . . 
3352 26 1 Search search VB 3352 26 2 for for IN 3352 26 3 independent independent JJ 3352 26 4 content content NN 3352 26 5 : : : 3352 26 6 Numeric numeric JJ 3352 26 7 data datum NNS 3352 26 8 sets set NNS 3352 26 9 commonly commonly RB 3352 26 10 have have VBP 3352 26 11 associated associate VBN 3352 26 12 text text NN 3352 26 13 in in IN 3352 26 14 the the DT 3352 26 15 form form NN 3352 26 16 of of IN 3352 26 17 documentation documentation NN 3352 26 18 , , , 3352 26 19 code code NN 3352 26 20 books book NNS 3352 26 21 , , , 3352 26 22 and and CC 3352 26 23 commentary commentary NN 3352 26 24 . . . 3352 27 1 However however RB 3352 27 2 , , , 3352 27 3 the the DT 3352 27 4 authors author NNS 3352 27 5 were be VBD 3352 27 6 interested interested JJ 3352 27 7 in in IN 3352 27 8 finding find VBG 3352 27 9 topical topical JJ 3352 27 10 content content NN 3352 27 11 that that WDT 3352 27 12 had have VBD 3352 27 13 no no DT 3352 27 14 such such JJ 3352 27 15 formal formal JJ 3352 27 16 or or CC 3352 27 17 liter- liter- JJ 3352 27 18 ary ary JJ 3352 27 19 connection connection NN 3352 27 20 . . . 3352 28 1 Independent independent JJ 3352 28 2 means mean NNS 3352 28 3 , , , 3352 28 4 for for IN 3352 28 5 example example NN 3352 28 6 , , , 3352 28 7 a a DT 3352 28 8 newspaper newspaper NN 3352 28 9 article article NN 3352 28 10 written write VBN 3352 28 11 by by IN 3352 28 12 someone someone NN 3352 28 13 unaware unaware JJ 3352 28 14 that that DT 3352 28 15 relevant relevant JJ 3352 28 16 statistical statistical JJ 3352 28 17 data datum NNS 3352 28 18 existed exist VBD 3352 28 19 or or CC 3352 28 20 had have VBD 3352 28 21 been be VBN 3352 28 22 written write VBN 3352 28 23 before before IN 3352 28 24 the the DT 3352 28 25 author author NN 3352 28 26 ’s ’s POS 3352 28 27 article article NN 3352 28 28 existed exist VBD 3352 28 29 . . . 
3352 29 1 In in IN 3352 29 2 the the DT 3352 29 3 other other JJ 3352 29 4 direction direction NN 3352 29 5 , , , 3352 29 6 having have VBG 3352 29 7 found find VBN 3352 29 8 statistical statistical JJ 3352 29 9 data datum NNS 3352 29 10 of of IN 3352 29 11 interest interest NN 3352 29 12 , , , 3352 29 13 could could MD 3352 29 14 topically topically RB 3352 29 15 related relate VBN 3352 29 16 text text NN 3352 29 17 created create VBN 3352 29 18 inde- inde- VBP 3352 29 19 pendently pendently RB 3352 29 20 of of IN 3352 29 21 this this DT 3352 29 22 particular particular JJ 3352 29 23 data data NN 3352 29 24 point point NN 3352 29 25 be be VB 3352 29 26 found find VBN 3352 29 27 ? ? . 3352 30 1 3 3 LS 3352 30 2 . . . 3352 31 1 Two two CD 3352 31 2 different different JJ 3352 31 3 media medium NNS 3352 31 4 forms form NNS 3352 31 5 were be VBD 3352 31 6 chosen choose VBN 3352 31 7 : : : 3352 31 8 text text NN 3352 31 9 and and CC 3352 31 10 numeric numeric JJ 3352 31 11 data datum NNS 3352 31 12 sets set NNS 3352 31 13 . . . 
3352 32 1 They -PRON- PRP 3352 32 2 look look VBP 3352 32 3 similar similar JJ 3352 32 4 because because IN 3352 32 5 they -PRON- PRP 3352 32 6 both both DT 3352 32 7 use use VBP 3352 32 8 arabic arabic JJ 3352 32 9 numerals numeral NNS 3352 32 10 , , , 3352 32 11 but but CC 3352 32 12 the the DT 3352 32 13 traditional traditional JJ 3352 32 14 reli- reli- NN 3352 32 15 ance ance NN 3352 32 16 on on IN 3352 32 17 information information NN 3352 32 18 retrieval retrieval NN 3352 32 19 in in IN 3352 32 20 a a DT 3352 32 21 text text NN 3352 32 22 environment environment NN 3352 32 23 Search search NN 3352 32 24 across across IN 3352 32 25 Different different JJ 3352 32 26 Media medium NNS 3352 32 27 : : : 3352 32 28 Numeric Numeric NNP 3352 32 29 Data Data NNP 3352 32 30 Sets Sets NNPS 3352 32 31 and and CC 3352 32 32 Text Text NNP 3352 32 33 Files Files NNP 3352 32 34 Michael Michael NNP 3352 32 35 Buckland Buckland NNP 3352 32 36 , , , 3352 32 37 Aitao Aitao NNP 3352 32 38 Chen Chen NNP 3352 32 39 , , , 3352 32 40 Fredric Fredric NNP 3352 32 41 C. C. NNP 3352 32 42 Gey Gey NNP 3352 32 43 , , , 3352 32 44 and and CC 3352 32 45 Ray Ray NNP 3352 32 46 R. R. 
NNP 3352 32 47 Larson Larson NNP 3352 32 48 Michael Michael NNP 3352 32 49 Buckland Buckland NNP 3352 32 50 ( ( -LRB- 3352 32 51 buckland@sims.berkeley.edu buckland@sims.berkeley.edu NNP 3352 32 52 ) ) -RRB- 3352 32 53 is be VBZ 3352 32 54 Emeritus Emeritus NNP 3352 32 55 Professor Professor NNP 3352 32 56 , , , 3352 32 57 School School NNP 3352 32 58 of of IN 3352 32 59 Information Information NNP 3352 32 60 , , , 3352 32 61 University University NNP 3352 32 62 of of IN 3352 32 63 California California NNP 3352 32 64 , , , 3352 32 65 Berkeley Berkeley NNP 3352 32 66 ; ; : 3352 32 67 Aitao Aitao NNP 3352 32 68 Chen Chen NNP 3352 32 69 ( ( -LRB- 3352 32 70 aitao@yahoo-inc.com aitao@yahoo-inc.com ADD 3352 32 71 ) ) -RRB- 3352 32 72 is be VBZ 3352 32 73 a a DT 3352 32 74 researcher researcher NN 3352 32 75 at at IN 3352 32 76 Yahoo Yahoo NNP 3352 32 77 ! ! . 3352 32 78 , , , 3352 32 79 Sunnyvale Sunnyvale NNP 3352 32 80 , , , 3352 32 81 California California NNP 3352 32 82 ; ; : 3352 32 83 Fredric Fredric NNP 3352 32 84 C. C. NNP 3352 32 85 Gey Gey NNP 3352 32 86 ( ( -LRB- 3352 32 87 gey@berkeley gey@berkeley NNP 3352 32 88 .edu .edu NFP 3352 32 89 ) ) -RRB- 3352 32 90 is be VBZ 3352 32 91 an an DT 3352 32 92 Information Information NNP 3352 32 93 Scientist Scientist NNP 3352 32 94 , , , 3352 32 95 UC UC NNP 3352 32 96 Data Data NNP 3352 32 97 Archive Archive NNP 3352 32 98 and and CC 3352 32 99 Technical Technical NNP 3352 32 100 Assistance Assistance NNP 3352 32 101 at at IN 3352 32 102 the the DT 3352 32 103 University University NNP 3352 32 104 of of IN 3352 32 105 California California NNP 3352 32 106 , , , 3352 32 107 Berkeley Berkeley NNP 3352 32 108 ; ; : 3352 32 109 and and CC 3352 32 110 Ray Ray NNP 3352 32 111 R. R. 
NNP 3352 32 112 Larson Larson NNP 3352 32 113 ( ( -LRB- 3352 32 114 ray@sims.berkeley.edu ray@sims.berkeley.edu NNP 3352 32 115 ) ) -RRB- 3352 32 116 is be VBZ 3352 32 117 a a DT 3352 32 118 Professor Professor NNP 3352 32 119 , , , 3352 32 120 School School NNP 3352 32 121 of of IN 3352 32 122 Information Information NNP 3352 32 123 at at IN 3352 32 124 the the DT 3352 32 125 University University NNP 3352 32 126 of of IN 3352 32 127 California California NNP 3352 32 128 , , , 3352 32 129 Berkeley Berkeley NNP 3352 32 130 . . . 3352 33 1 182 182 CD 3352 33 2 INFORMATION information NN 3352 33 3 TECHNOLOGY technology NN 3352 33 4 AND and CC 3352 33 5 LIBRARIES library NNS 3352 33 6 | | CD 3352 33 7 DECEMBER DECEMBER NNP 3352 33 8 2006 2006 CD 3352 33 9 of of IN 3352 33 10 using use VBG 3352 33 11 any any DT 3352 33 12 character character NN 3352 33 13 string string NN 3352 33 14 from from IN 3352 33 15 the the DT 3352 33 16 corpus corpus NNP 3352 33 17 as as IN 3352 33 18 a a DT 3352 33 19 query query NN 3352 33 20 , , , 3352 33 21 although although IN 3352 33 22 technically technically RB 3352 33 23 feasible feasible JJ 3352 33 24 , , , 3352 33 25 can can MD 3352 33 26 not not RB 3352 33 27 be be VB 3352 33 28 expected expect VBN 3352 33 29 to to TO 3352 33 30 be be VB 3352 33 31 useful useful JJ 3352 33 32 here here RB 3352 33 33 . . . 
3352 34 1 One one PRP 3352 34 2 can can MD 3352 34 3 copy copy VB 3352 34 4 a a DT 3352 34 5 number number NN 3352 34 6 expressing express VBG 3352 34 7 quantity quantity NN 3352 34 8 , , , 3352 34 9 such such JJ 3352 34 10 as as IN 3352 34 11 12,941 12,941 CD 3352 34 12 , , , 3352 34 13 from from IN 3352 34 14 a a DT 3352 34 15 numeric numeric JJ 3352 34 16 data data NN 3352 34 17 cell cell NN 3352 34 18 , , , 3352 34 19 use use VB 3352 34 20 it -PRON- PRP 3352 34 21 as as IN 3352 34 22 a a DT 3352 34 23 query query NN 3352 34 24 in in IN 3352 34 25 a a DT 3352 34 26 text text NN 3352 34 27 search search NN 3352 34 28 engine engine NN 3352 34 29 such such JJ 3352 34 30 as as IN 3352 34 31 Google Google NNP 3352 34 32 , , , 3352 34 33 and and CC 3352 34 34 retrieve retrieve VB 3352 34 35 a a DT 3352 34 36 large large JJ 3352 34 37 and and CC 3352 34 38 eclectic eclectic JJ 3352 34 39 retrieved retrieved JJ 3352 34 40 set set NN 3352 34 41 , , , 3352 34 42 usually usually RB 3352 34 43 involving involve VBG 3352 34 44 “ " `` 3352 34 45 12941 12941 CD 3352 34 46 ” " '' 3352 34 47 as as IN 3352 34 48 an an DT 3352 34 49 iden- iden- NN 3352 34 50 tifying tifye VBG 3352 34 51 number number NN 3352 34 52 for for IN 3352 34 53 a a DT 3352 34 54 postal postal JJ 3352 34 55 code code NN 3352 34 56 , , , 3352 34 57 a a DT 3352 34 58 memorandum memorandum NN 3352 34 59 , , , 3352 34 60 a a DT 3352 34 61 part part NN 3352 34 62 number number NN 3352 34 63 , , , 3352 34 64 software software NN 3352 34 65 bug bug NN 3352 34 66 report report NN 3352 34 67 , , , 3352 34 68 and and CC 3352 34 69 so so RB 3352 34 70 on on RB 3352 34 71 , , , 3352 34 72 but but CC 3352 34 73 the the DT 3352 34 74 relationship relationship NN 3352 34 75 is be VBZ 3352 34 76 spurious spurious JJ 3352 34 77 . . . 
3352 35 1 It -PRON- PRP 3352 35 2 requires require VBZ 3352 35 3 great great JJ 3352 35 4 faith faith NN 3352 35 5 in in IN 3352 35 6 numerology numerology NN 3352 35 7 to to TO 3352 35 8 expect expect VB 3352 35 9 anything anything NN 3352 35 10 topically topically RB 3352 35 11 mean- mean- NN 3352 35 12 ingful ingful JJ 3352 35 13 to to IN 3352 35 14 the the DT 3352 35 15 original original JJ 3352 35 16 data data NN 3352 35 17 cell cell NN 3352 35 18 one one CD 3352 35 19 started start VBD 3352 35 20 with with IN 3352 35 21 . . . 3352 36 1 With with IN 3352 36 2 other other JJ 3352 36 3 combinations combination NNS 3352 36 4 of of IN 3352 36 5 media medium NNS 3352 36 6 forms form NNS 3352 36 7 , , , 3352 36 8 not not RB 3352 36 9 even even RB 3352 36 10 spurious spurious JJ 3352 36 11 results result NNS 3352 36 12 are be VBP 3352 36 13 feasible feasible JJ 3352 36 14 : : : 3352 36 15 one one PRP 3352 36 16 can can MD 3352 36 17 not not RB 3352 36 18 submit submit VB 3352 36 19 a a DT 3352 36 20 musical musical JJ 3352 36 21 fragment fragment NN 3352 36 22 or or CC 3352 36 23 some some DT 3352 36 24 pixels pixel NNS 3352 36 25 from from IN 3352 36 26 an an DT 3352 36 27 image image NN 3352 36 28 as as IN 3352 36 29 a a DT 3352 36 30 text text NN 3352 36 31 query query NN 3352 36 32 . . . 3352 37 1 4 4 LS 3352 37 2 . . . 
3352 38 1 The the DT 3352 38 2 authors author NNS 3352 38 3 ’ ’ POS 3352 38 4 interest interest NN 3352 38 5 was be VBD 3352 38 6 in in IN 3352 38 7 how how WRB 3352 38 8 to to TO 3352 38 9 achieve achieve VB 3352 38 10 a a DT 3352 38 11 bet- bet- JJ 3352 38 12 ter ter NN 3352 38 13 return return NN 3352 38 14 on on IN 3352 38 15 existing exist VBG 3352 38 16 investments investment NNS 3352 38 17 in in IN 3352 38 18 well well RB 3352 38 19 - - HYPH 3352 38 20 formed form VBN 3352 38 21 , , , 3352 38 22 edited edit VBN 3352 38 23 resources resource NNS 3352 38 24 with with IN 3352 38 25 descriptive descriptive JJ 3352 38 26 metadata metadata NN 3352 38 27 . . . 3352 39 1 This this DT 3352 39 2 project project NN 3352 39 3 built build VBN 3352 39 4 directly directly RB 3352 39 5 on on IN 3352 39 6 prior prior JJ 3352 39 7 work work NN 3352 39 8 on on IN 3352 39 9 how how WRB 3352 39 10 to to TO 3352 39 11 make make VB 3352 39 12 more more JJR 3352 39 13 effective effective JJ 3352 39 14 use use NN 3352 39 15 of of IN 3352 39 16 existing existing JJ 3352 39 17 , , , 3352 39 18 expertly expertly RB 3352 39 19 developed develop VBN 3352 39 20 metadata metadata NN 3352 39 21 , , , 3352 39 22 rather rather RB 3352 39 23 than than IN 3352 39 24 creating create VBG 3352 39 25 or or CC 3352 39 26 replacing replace VBG 3352 39 27 meta- meta- JJ 3352 39 28 data datum NNS 3352 39 29 . . . 3352 40 1 Search search NN 3352 40 2 of of IN 3352 40 3 multiple multiple JJ 3352 40 4 resources resource NNS 3352 40 5 comes come VBZ 3352 40 6 in in IN 3352 40 7 two two CD 3352 40 8 forms form NNS 3352 40 9 : : : 3352 40 10 1 1 CD 3352 40 11 . . . 
3352 41 1 Parallel parallel JJ 3352 41 2 search search NN 3352 41 3 is be VBZ 3352 41 4 when when WRB 3352 41 5 a a DT 3352 41 6 single single JJ 3352 41 7 query query NN 3352 41 8 is be VBZ 3352 41 9 sent send VBN 3352 41 10 to to IN 3352 41 11 two two CD 3352 41 12 or or CC 3352 41 13 more more JJR 3352 41 14 resources resource NNS 3352 41 15 at at IN 3352 41 16 more more RBR 3352 41 17 or or CC 3352 41 18 less less RBR 3352 41 19 the the DT 3352 41 20 same same JJ 3352 41 21 time time NN 3352 41 22 . . . 3352 42 1 For for IN 3352 42 2 example example NN 3352 42 3 , , , 3352 42 4 a a DT 3352 42 5 researcher researcher NN 3352 42 6 interested interested JJ 3352 42 7 in in IN 3352 42 8 the the DT 3352 42 9 import import NN 3352 42 10 of of IN 3352 42 11 shrimp shrimp NN 3352 42 12 would would MD 3352 42 13 like like VB 3352 42 14 to to TO 3352 42 15 see see VB 3352 42 16 pertinent pertinent JJ 3352 42 17 newspaper newspaper NN 3352 42 18 articles article NNS 3352 42 19 and and CC 3352 42 20 trade trade NN 3352 42 21 statistics statistic NNS 3352 42 22 . . . 3352 43 1 Thus thus RB 3352 43 2 , , , 3352 43 3 one one PRP 3352 43 4 might may MD 3352 43 5 send send VB 3352 43 6 a a DT 3352 43 7 query query NN 3352 43 8 to to IN 3352 43 9 the the DT 3352 43 10 Census Census NNP 3352 43 11 Bureau Bureau NNP 3352 43 12 ’s ’s POS 3352 43 13 United United NNP 3352 43 14 States States NNP 3352 43 15 ( ( -LRB- 3352 43 16 U.S. U.S. 
NNP 3352 43 17 ) ) -RRB- 3352 43 18 Imports Imports NNPS 3352 43 19 and and CC 3352 43 20 Exports Exports NNPS 3352 43 21 numeric numeric JJ 3352 43 22 data datum NNS 3352 43 23 series series NN 3352 43 24 and and CC 3352 43 25 look look VB 3352 43 26 at at IN 3352 43 27 SIC sic NN 3352 43 28 0913 0913 CD 3352 43 29 for for IN 3352 43 30 shrimp shrimp NN 3352 43 31 and and CC 3352 43 32 prawn prawn NNP 3352 43 33 and and CC 3352 43 34 note note VB 3352 43 35 a a DT 3352 43 36 dra- dra- RB 3352 43 37 matic matic JJ 3352 43 38 increase increase NN 3352 43 39 in in IN 3352 43 40 imports import NNS 3352 43 41 from from IN 3352 43 42 Vietnam Vietnam NNP 3352 43 43 through through IN 3352 43 44 Los Los NNP 3352 43 45 Angeles Angeles NNP 3352 43 46 from from IN 3352 43 47 1995 1995 CD 3352 43 48 onwards onwards RB 3352 43 49 . . . 3352 44 1 One one PRP 3352 44 2 would would MD 3352 44 3 also also RB 3352 44 4 search search VB 3352 44 5 newspaper newspaper NN 3352 44 6 indexes index NNS 3352 44 7 for for IN 3352 44 8 articles article NNS 3352 44 9 such such JJ 3352 44 10 as as IN 3352 44 11 “ " `` 3352 44 12 Normalizing normalize VBG 3352 44 13 ties tie NNS 3352 44 14 to to IN 3352 44 15 Vietnam Vietnam NNP 3352 44 16 important important JJ 3352 44 17 steps step NNS 3352 44 18 for for IN 3352 44 19 U.S. U.S. NNP 3352 44 20 firms firm NNS 3352 44 21 ; ; : 3352 44 22 California California NNP 3352 44 23 stands stand VBZ 3352 44 24 to to TO 3352 44 25 profit profit VB 3352 44 26 handsomely handsomely RB 3352 44 27 when when WRB 3352 44 28 barriers barrier NNS 3352 44 29 fall fall VBP 3352 44 30 to to IN 3352 44 31 trade trade NN 3352 44 32 with with IN 3352 44 33 fast fast RB 3352 44 34 - - HYPH 3352 44 35 growing grow VBG 3352 44 36 coun- coun- XX 3352 44 37 try try NN 3352 44 38 . . . 
3352 44 39 ”3 ”3 JJ 3352 44 40 Different different JJ 3352 44 41 sources source NNS 3352 44 42 are be VBP 3352 44 43 likely likely JJ 3352 44 44 to to TO 3352 44 45 use use VB 3352 44 46 different different JJ 3352 44 47 index index NN 3352 44 48 terms term NNS 3352 44 49 or or CC 3352 44 50 categories category NNS 3352 44 51 , , , 3352 44 52 so so IN 3352 44 53 the the DT 3352 44 54 challenge challenge NN 3352 44 55 is be VBZ 3352 44 56 how how WRB 3352 44 57 to to TO 3352 44 58 express express VB 3352 44 59 the the DT 3352 44 60 searcher searcher NN 3352 44 61 ’s ’s NNP 3352 44 62 query query NN 3352 44 63 in in IN 3352 44 64 terms term NNS 3352 44 65 that that WDT 3352 44 66 will will MD 3352 44 67 be be VB 3352 44 68 effective effective JJ 3352 44 69 for for IN 3352 44 70 searching search VBG 3352 44 71 in in IN 3352 44 72 the the DT 3352 44 73 target target NN 3352 44 74 resources resource NNS 3352 44 75 , , , 3352 44 76 which which WDT 3352 44 77 , , , 3352 44 78 mostly mostly RB 3352 44 79 likely likely JJ 3352 44 80 , , , 3352 44 81 will will MD 3352 44 82 use use VB 3352 44 83 different different JJ 3352 44 84 vocabular- vocabular- NN 3352 44 85 ies ie NNS 3352 44 86 . . . 
3352 45 1 As as IN 3352 45 2 one one CD 3352 45 3 example example NN 3352 45 4 , , , 3352 45 5 the the DT 3352 45 6 term term NN 3352 45 7 for for IN 3352 45 8 “ " `` 3352 45 9 automobiles automobile NNS 3352 45 10 ” " '' 3352 45 11 is be VBZ 3352 45 12 3711 3711 CD 3352 45 13 in in IN 3352 45 14 the the DT 3352 45 15 Standard Standard NNP 3352 45 16 Industrial Industrial NNP 3352 45 17 Classification Classification NNP 3352 45 18 ; ; : 3352 45 19 TL TL NNP 3352 45 20 205 205 CD 3352 45 21 in in IN 3352 45 22 the the DT 3352 45 23 Library Library NNP 3352 45 24 of of IN 3352 45 25 Congress Congress NNP 3352 45 26 ( ( -LRB- 3352 45 27 LC LC NNP 3352 45 28 ) ) -RRB- 3352 45 29 Classification Classification NNP 3352 45 30 , , , 3352 45 31 180/280 180/280 NNS 3352 45 32 in in IN 3352 45 33 the the DT 3352 45 34 U.S. U.S. NNP 3352 45 35 Patent Patent NNP 3352 45 36 Classification Classification NNP 3352 45 37 ; ; : 3352 45 38 and and CC 3352 45 39 , , , 3352 45 40 in in IN 3352 45 41 the the DT 3352 45 42 Census Census NNP 3352 45 43 Bureau Bureau NNP 3352 45 44 ’s ’s POS 3352 45 45 U.S. U.S. NNP 3352 45 46 Imports Imports NNPS 3352 45 47 and and CC 3352 45 48 Exports Exports NNPS 3352 45 49 data data NN 3352 45 50 series series NN 3352 45 51 , , , 3352 45 52 PASS PASS NNP 3352 45 53 MOT MOT NNP 3352 45 54 VEH VEH NNP 3352 45 55 , , , 3352 45 56 SPARK SPARK NNP 3352 45 57 IGN IGN NNP 3352 45 58 ENG.4 ENG.4 NNS 3352 45 59 2 2 CD 3352 45 60 . . . 
3352 46 1 Transverse transverse JJ 3352 46 2 search search NN 3352 46 3 is be VBZ 3352 46 4 when when WRB 3352 46 5 an an DT 3352 46 6 item item NN 3352 46 7 of of IN 3352 46 8 interest interest NN 3352 46 9 found find VBN 3352 46 10 in in IN 3352 46 11 one one CD 3352 46 12 resource resource NN 3352 46 13 is be VBZ 3352 46 14 used use VBN 3352 46 15 as as IN 3352 46 16 the the DT 3352 46 17 basis basis NN 3352 46 18 for for IN 3352 46 19 a a DT 3352 46 20 query query NN 3352 46 21 to to TO 3352 46 22 be be VB 3352 46 23 forwarded forward VBN 3352 46 24 to to IN 3352 46 25 a a DT 3352 46 26 different different JJ 3352 46 27 resource resource NN 3352 46 28 . . . 3352 47 1 The the DT 3352 47 2 challenge challenge NN 3352 47 3 here here RB 3352 47 4 , , , 3352 47 5 again again RB 3352 47 6 , , , 3352 47 7 is be VBZ 3352 47 8 that that IN 3352 47 9 when when WRB 3352 47 10 a a DT 3352 47 11 query query NN 3352 47 12 using use VBG 3352 47 13 the the DT 3352 47 14 topical topical JJ 3352 47 15 metadata metadata NN 3352 47 16 in in IN 3352 47 17 one one CD 3352 47 18 resource resource NN 3352 47 19 needs need VBZ 3352 47 20 to to TO 3352 47 21 be be VB 3352 47 22 expressed express VBN 3352 47 23 in in IN 3352 47 24 the the DT 3352 47 25 vocabulary vocabulary NN 3352 47 26 of of IN 3352 47 27 the the DT 3352 47 28 target target NN 3352 47 29 resource resource NN 3352 47 30 , , , 3352 47 31 the the DT 3352 47 32 metadata metadata NN 3352 47 33 vocabularies vocabulary NNS 3352 47 34 in in IN 3352 47 35 the the DT 3352 47 36 two two CD 3352 47 37 resources resource NNS 3352 47 38 will will MD 3352 47 39 usually usually RB 3352 47 40 be be VB 3352 47 41 different different JJ 3352 47 42 from from IN 3352 47 43 each each DT 3352 47 44 other other JJ 3352 47 45 , , , 3352 47 46 and and CC 3352 47 47 , , , 3352 47 48 quite quite RB 3352 47 49 likely likely JJ 3352 47 50 , , , 3352 47 51 both both DT 3352 47 52 are be VBP 3352 47 53 unfamiliar unfamiliar JJ 3352 47 54 to to IN 
3352 47 55 the the DT 3352 47 56 searcher searcher NN 3352 47 57 . . . 3352 48 1 When when WRB 3352 48 2 searching search VBG 3352 48 3 within within IN 3352 48 4 a a DT 3352 48 5 single single JJ 3352 48 6 media medium NNS 3352 48 7 form form NN 3352 48 8 , , , 3352 48 9 it -PRON- PRP 3352 48 10 may may MD 3352 48 11 be be VB 3352 48 12 possible possible JJ 3352 48 13 to to TO 3352 48 14 use use VB 3352 48 15 content content NN 3352 48 16 itself -PRON- PRP 3352 48 17 directly directly RB 3352 48 18 as as IN 3352 48 19 a a DT 3352 48 20 query query NN 3352 48 21 : : : 3352 48 22 A a DT 3352 48 23 frag- frag- JJ 3352 48 24 ment ment NN 3352 48 25 of of IN 3352 48 26 text text NN 3352 48 27 in in IN 3352 48 28 a a DT 3352 48 29 source source NN 3352 48 30 - - HYPH 3352 48 31 text text NN 3352 48 32 database database NN 3352 48 33 is be VBZ 3352 48 34 commonly commonly RB 3352 48 35 used use VBN 3352 48 36 as as IN 3352 48 37 a a DT 3352 48 38 query query NN 3352 48 39 in in IN 3352 48 40 a a DT 3352 48 41 target target NN 3352 48 42 - - HYPH 3352 48 43 text text NN 3352 48 44 database database NN 3352 48 45 . . . 3352 49 1 Similarly similarly RB 3352 49 2 , , , 3352 49 3 one one PRP 3352 49 4 might may MD 3352 49 5 start start VB 3352 49 6 with with IN 3352 49 7 an an DT 3352 49 8 image image NN 3352 49 9 and and CC 3352 49 10 seek seek VB 3352 49 11 images image NNS 3352 49 12 that that WDT 3352 49 13 are be VBP 3352 49 14 measur- measur- VBG 3352 49 15 ably ably RB 3352 49 16 similar similar JJ 3352 49 17 . . . 
3352 50 1 However however RB 3352 50 2 , , , 3352 50 3 because because IN 3352 50 4 such such JJ 3352 50 5 direct direct JJ 3352 50 6 search search NN 3352 50 7 can can MD 3352 50 8 not not RB 3352 50 9 be be VB 3352 50 10 done do VBN 3352 50 11 when when WRB 3352 50 12 searching search VBG 3352 50 13 across across IN 3352 50 14 different different JJ 3352 50 15 media medium NNS 3352 50 16 forms form NNS 3352 50 17 , , , 3352 50 18 an an DT 3352 50 19 indirect indirect JJ 3352 50 20 approach approach NN 3352 50 21 relying rely VBG 3352 50 22 on on IN 3352 50 23 the the DT 3352 50 24 use use NN 3352 50 25 of of IN 3352 50 26 interpretive interpretive JJ 3352 50 27 representations representation NNS 3352 50 28 becomes become VBZ 3352 50 29 necessary necessary JJ 3352 50 30 . . . 3352 51 1 As as IN 3352 51 2 the the DT 3352 51 3 network network NN 3352 51 4 envi- envi- VBZ 3352 51 5 ronment ronment NN 3352 51 6 expands expand NNS 3352 51 7 , , , 3352 51 8 mapping map VBG 3352 51 9 between between IN 3352 51 10 vocabularies vocabulary NNS 3352 51 11 will will MD 3352 51 12 be be VB 3352 51 13 increasingly increasingly RB 3352 51 14 important important JJ 3352 51 15 . . . 3352 52 1 ■ ■ NFP 3352 52 2 Text text NN 3352 52 3 and and CC 3352 52 4 numeric numeric JJ 3352 52 5 resources resource NNS 3352 52 6 Text Text NNP 3352 52 7 resource resource NN 3352 52 8 A a DT 3352 52 9 library library NN 3352 52 10 catalog catalog NN 3352 52 11 — — : 3352 52 12 a a DT 3352 52 13 special special JJ 3352 52 14 case case NN 3352 52 15 of of IN 3352 52 16 text text NN 3352 52 17 file file NN 3352 52 18 — — : 3352 52 19 was be VBD 3352 52 20 chosen choose VBN 3352 52 21 for for IN 3352 52 22 use use NN 3352 52 23 as as IN 3352 52 24 a a DT 3352 52 25 text text NN 3352 52 26 file file NN 3352 52 27 rather rather RB 3352 52 28 than than IN 3352 52 29 a a DT 3352 52 30 corpus corpus NN 3352 52 31 of of IN 3352 52 32 “ " `` 3352 52 33 full full JJ 3352 52 34 text text NN 3352 52 35 . . . 
3352 52 36 ” " '' 3352 52 37 The the DT 3352 52 38 reasons reason NNS 3352 52 39 were be VBD 3352 52 40 practical practical JJ 3352 52 41 : : : 3352 52 42 In in IN 3352 52 43 this this DT 3352 52 44 exploratory exploratory JJ 3352 52 45 investiga- investiga- NN 3352 52 46 tion tion NN 3352 52 47 , , , 3352 52 48 it -PRON- PRP 3352 52 49 was be VBD 3352 52 50 important important JJ 3352 52 51 to to TO 3352 52 52 start start VB 3352 52 53 with with IN 3352 52 54 resources resource NNS 3352 52 55 that that WDT 3352 52 56 had have VBD 3352 52 57 rich rich JJ 3352 52 58 metadata metadata NN 3352 52 59 ; ; : 3352 52 60 it -PRON- PRP 3352 52 61 needed need VBD 3352 52 62 to to TO 3352 52 63 be be VB 3352 52 64 a a DT 3352 52 65 resource resource NN 3352 52 66 that that WDT 3352 52 67 was be VBD 3352 52 68 sufficiently sufficiently RB 3352 52 69 controllable controllable JJ 3352 52 70 to to TO 3352 52 71 enable enable VB 3352 52 72 experimentation experimentation NN 3352 52 73 with with IN 3352 52 74 it -PRON- PRP 3352 52 75 . . . 
3352 53 1 A a DT 3352 53 2 library library NN 3352 53 3 catalog catalog NN 3352 53 4 was be VBD 3352 53 5 in in IN 3352 53 6 the the DT 3352 53 7 spirit spirit NN 3352 53 8 of of IN 3352 53 9 the the DT 3352 53 10 project project NN 3352 53 11 in in IN 3352 53 12 that that IN 3352 53 13 it -PRON- PRP 3352 53 14 would would MD 3352 53 15 lead lead VB 3352 53 16 to to IN 3352 53 17 additional additional JJ 3352 53 18 text text NN 3352 53 19 resources resource NNS 3352 53 20 ; ; : 3352 53 21 and and CC 3352 53 22 a a DT 3352 53 23 suitable suitable JJ 3352 53 24 resource resource NN 3352 53 25 was be VBD 3352 53 26 available available JJ 3352 53 27 , , , 3352 53 28 which which WDT 3352 53 29 was be VBD 3352 53 30 intended intend VBN 3352 53 31 for for IN 3352 53 32 metadata metadata NN 3352 53 33 mapping mapping NN 3352 53 34 : : : 3352 53 35 a a DT 3352 53 36 set set NN 3352 53 37 of of IN 3352 53 38 several several JJ 3352 53 39 million million CD 3352 53 40 MARC MARC NNP 3352 53 41 records record NNS 3352 53 42 , , , 3352 53 43 derived derive VBN 3352 53 44 from from IN 3352 53 45 MELVYL MELVYL NNP 3352 53 46 , , , 3352 53 47 the the DT 3352 53 48 University University NNP 3352 53 49 of of IN 3352 53 50 California California NNP 3352 53 51 online online JJ 3352 53 52 library library NN 3352 53 53 catalog catalog NN 3352 53 54 . . . 3352 54 1 Socioeconomic socioeconomic JJ 3352 54 2 numeric numeric JJ 3352 54 3 data datum NNS 3352 54 4 set set VBN 3352 54 5 Initially initially RB 3352 54 6 , , , 3352 54 7 and and CC 3352 54 8 in in IN 3352 54 9 prior prior JJ 3352 54 10 work work NN 3352 54 11 , , , 3352 54 12 the the DT 3352 54 13 authors author NNS 3352 54 14 had have VBD 3352 54 15 worked work VBN 3352 54 16 on on IN 3352 54 17 access access NN 3352 54 18 to to IN 3352 54 19 U.S. U.S. 
NNP 3352 54 20 federal federal JJ 3352 54 21 data datum NNS 3352 54 22 series series NN 3352 54 23 , , , 3352 54 24 especially especially RB 3352 54 25 import import NN 3352 54 26 and and CC 3352 54 27 export export NN 3352 54 28 statistics statistic NNS 3352 54 29 and and CC 3352 54 30 county county NN 3352 54 31 business business NN 3352 54 32 reports report VBZ 3352 54 33 . . . 3352 55 1 Although although IN 3352 55 2 some some DT 3352 55 3 progress progress NN 3352 55 4 was be VBD 3352 55 5 made make VBN 3352 55 6 with with IN 3352 55 7 interfaces interface NNS 3352 55 8 to to IN 3352 55 9 these these DT 3352 55 10 data datum NNS 3352 55 11 series series NN 3352 55 12 , , , 3352 55 13 it -PRON- PRP 3352 55 14 became become VBD 3352 55 15 clear clear JJ 3352 55 16 that that IN 3352 55 17 the the DT 3352 55 18 investment investment NN 3352 55 19 needed need VBN 3352 55 20 to to TO 3352 55 21 craft craft VB 3352 55 22 interoperable interoperable JJ 3352 55 23 access access NN 3352 55 24 was be VBD 3352 55 25 high high JJ 3352 55 26 relative relative JJ 3352 55 27 to to IN 3352 55 28 the the DT 3352 55 29 available available JJ 3352 55 30 staff staff NN 3352 55 31 . . . 
3352 56 1 Crafting craft VBG 3352 56 2 access access NN 3352 56 3 to to IN 3352 56 4 individual individual JJ 3352 56 5 data datum NNS 3352 56 6 series series NN 3352 56 7 did do VBD 3352 56 8 not not RB 3352 56 9 appear appear VB 3352 56 10 to to TO 3352 56 11 be be VB 3352 56 12 a a DT 3352 56 13 scalable scalable JJ 3352 56 14 way way NN 3352 56 15 to to TO 3352 56 16 demonstrate demonstrate VB 3352 56 17 variety variety NN 3352 56 18 within within IN 3352 56 19 the the DT 3352 56 20 authors author NNS 3352 56 21 ’ ’ POS 3352 56 22 limited limited JJ 3352 56 23 resources resource NNS 3352 56 24 , , , 3352 56 25 so so CC 3352 56 26 attention attention NN 3352 56 27 was be VBD 3352 56 28 turned turn VBN 3352 56 29 to to IN 3352 56 30 a a DT 3352 56 31 single single JJ 3352 56 32 collection collection NN 3352 56 33 comprising comprise VBG 3352 56 34 many many JJ 3352 56 35 diverse diverse JJ 3352 56 36 numeric numeric JJ 3352 56 37 tables table NNS 3352 56 38 , , , 3352 56 39 the the DT 3352 56 40 Counting Counting NNP 3352 56 41 California California NNP 3352 56 42 database.5 database.5 CD 3352 56 43 ■ ■ NFP 3352 56 44 Mapping map VBG 3352 56 45 topical topical JJ 3352 56 46 metadata metadata NN 3352 56 47 Well well RB 3352 56 48 - - HYPH 3352 56 49 edited edit VBN 3352 56 50 , , , 3352 56 51 high high JJ 3352 56 52 - - HYPH 3352 56 53 quality quality NN 3352 56 54 databases database NNS 3352 56 55 typically typically RB 3352 56 56 have have VBP 3352 56 57 topi- topi- XX 3352 56 58 cal cal NN 3352 56 59 metadata metadata NN 3352 56 60 expertly expertly RB 3352 56 61 assigned assign VBN 3352 56 62 from from IN 3352 56 63 a a DT 3352 56 64 vocabulary vocabulary NN 3352 56 65 ( ( -LRB- 3352 56 66 the- the- DT 3352 56 67 saurus saurus NN 3352 56 68 , , , 3352 56 69 classification classification NN 3352 56 70 , , , 3352 56 71 subject subject NN 3352 56 72 - - HYPH 3352 56 73 heading head VBG 3352 56 74 system system NN 3352 56 75 , , , 3352 56 76 or or CC 3352 56 77 
set set NN 3352 56 78 of of IN 3352 56 93 categories category NNS 3352 56 94 ) ) -RRB- 3352 56 95 . . . 3352 57 1 But but CC 3352 57 2 there there EX 3352 57 3 is be VBZ 3352 57 4 a a DT 3352 57 5 Babel Babel NNP 3352 57 6 of of IN 3352 57 7 different different JJ 3352 57 8 vocabularies vocabulary NNS 3352 57 9 . . . 3352 58 1 Not not RB 3352 58 2 only only RB 3352 58 3 do do VBP 3352 58 4 the the DT 3352 58 5 names name NNS 3352 58 6 of of IN 3352 58 7 topics topic NNS 3352 58 8 vary vary VBP 3352 58 9 , , , 3352 58 10 but but CC 3352 58 11 the the DT 3352 58 12 underlying underlie VBG 3352 58 13 concepts concept NNS 3352 58 14 or or CC 3352 58 15 categories category NNS 3352 58 16 may may MD 3352 58 17 also also RB 3352 58 18 differ differ VB 3352 58 19 . . . 
3352 59 1 Effective effective JJ 3352 59 2 searching searching NN 3352 59 3 requires require VBZ 3352 59 4 expert expert JJ 3352 59 5 familiarity familiarity NN 3352 59 6 with with IN 3352 59 7 a a DT 3352 59 8 system system NN 3352 59 9 ’s ’s POS 3352 59 10 vocabulary vocabulary NN 3352 59 11 ; ; : 3352 59 12 but but CC 3352 59 13 as as IN 3352 59 14 access access NN 3352 59 15 to to IN 3352 59 16 digital digital JJ 3352 59 17 resources resource NNS 3352 59 18 expands expand VBZ 3352 59 19 , , , 3352 59 20 the the DT 3352 59 21 diversity diversity NN 3352 59 22 of of IN 3352 59 23 vocabularies vocabulary NNS 3352 59 24 increases increase NNS 3352 59 25 and and CC 3352 59 26 accessible accessible JJ 3352 59 27 resources resource NNS 3352 59 28 are be VBP 3352 59 29 decreasingly decreasingly RB 3352 59 30 likely likely JJ 3352 59 31 to to TO 3352 59 32 use use VB 3352 59 33 vocabularies vocabulary NNS 3352 59 34 familiar familiar JJ 3352 59 35 to to IN 3352 59 36 any any DT 3352 59 37 individual individual JJ 3352 59 38 searcher searcher NN 3352 59 39 . . . 3352 60 1 The the DT 3352 60 2 best good JJS 3352 60 3 answer answer NN 3352 60 4 is be VBZ 3352 60 5 twofold twofold JJ 3352 60 6 : : : 3352 60 7 First first RB 3352 60 8 , , , 3352 60 9 it -PRON- PRP 3352 60 10 is be VBZ 3352 60 11 desirable desirable JJ 3352 60 12 to to TO 3352 60 13 have have VB 3352 60 14 an an DT 3352 60 15 index index NN 3352 60 16 ( ( -LRB- 3352 60 17 a a DT 3352 60 18 “ " `` 3352 60 19 mapping mapping NN 3352 60 20 ” " '' 3352 60 21 ) ) -RRB- 3352 60 22 from from IN 3352 60 23 the the DT 3352 60 24 natural natural JJ 3352 60 25 language language NN 3352 60 26 of of IN 3352 60 27 each each DT 3352 60 28 group group NN 3352 60 29 of of IN 3352 60 30 searchers searcher NNS 3352 60 31 to to IN 3352 60 32 the the DT 3352 60 33 entries entry NNS 3352 60 34 used use VBN 3352 60 35 in in IN 3352 60 36 each each DT 3352 60 37 metadata metadata NN 3352 60 38 vocabulary vocabulary NN 3352 60 39 . . 
. 3352 61 1 Such such PDT 3352 61 2 a a DT 3352 61 3 mapping mapping NN 3352 61 4 provides provide VBZ 3352 61 5 an an DT 3352 61 6 index index NN 3352 61 7 from from IN 3352 61 8 a a DT 3352 61 9 vocabulary vocabulary NN 3352 61 10 familiar familiar JJ 3352 61 11 to to IN 3352 61 12 the the DT 3352 61 13 searcher searcher NN 3352 61 14 to to IN 3352 61 15 the the DT 3352 61 16 vocabulary vocabulary NN 3352 61 17 used use VBN 3352 61 18 in in IN 3352 61 19 entries entry NNS 3352 61 20 of of IN 3352 61 21 the the DT 3352 61 22 target target NN 3352 61 23 system system NN 3352 61 24 and and CC 3352 61 25 so so RB 3352 61 26 is be VBZ 3352 61 27 called call VBN 3352 61 28 a a DT 3352 61 29 search search NN 3352 61 30 - - HYPH 3352 61 31 term term NN 3352 61 32 recommender recommender NN 3352 61 33 system system NN 3352 61 34 . . . 3352 62 1 ( ( -LRB- 3352 62 2 The the DT 3352 62 3 authors author NNS 3352 62 4 called call VBD 3352 62 5 it -PRON- PRP 3352 62 6 an an DT 3352 62 7 “ " `` 3352 62 8 entry entry NN 3352 62 9 - - HYPH 3352 62 10 vocabulary vocabulary NN 3352 62 11 index index NN 3352 62 12 , , , 3352 62 13 ” " '' 3352 62 14 or or CC 3352 62 15 EVI EVI NNP 3352 62 16 . . . 3352 62 17 ) ) -RRB- 3352 63 1 Dewey Dewey NNP 3352 63 2 ’s ’s POS 3352 63 3 “ " `` 3352 63 4 Relativ Relativ NNP 3352 63 5 Index Index NNP 3352 63 6 ” " '' 3352 63 7 to to IN 3352 63 8 his -PRON- PRP$ 3352 63 9 Decimal Decimal NNP 3352 63 10 Classification Classification NNP 3352 63 11 is be VBZ 3352 63 12 a a DT 3352 63 13 famil- famil- VBG 3352 63 14 iar iar NNP 3352 63 15 example example NN 3352 63 16 . . . 
3352 64 1 When when WRB 3352 64 2 searching search VBG 3352 64 3 across across IN 3352 64 4 databases database NNS 3352 64 5 , , , 3352 64 6 one one CD 3352 64 7 also also RB 3352 64 8 wants want VBZ 3352 64 9 a a DT 3352 64 10 second second JJ 3352 64 11 kind kind NN 3352 64 12 of of IN 3352 64 13 mapping mapping NN 3352 64 14 : : : 3352 64 15 between between IN 3352 64 16 pairs pair NNS 3352 64 17 of of IN 3352 64 18 system system NN 3352 64 19 vocabularies vocabulary NNS 3352 64 20 . . . 3352 65 1 Unfortunately unfortunately RB 3352 65 2 , , , 3352 65 3 mappings mapping NNS 3352 65 4 between between IN 3352 65 5 different different JJ 3352 65 6 vocabularies vocabulary NNS 3352 65 7 are be VBP 3352 65 8 rare rare JJ 3352 65 9 , , , 3352 65 10 expensive expensive JJ 3352 65 11 , , , 3352 65 12 time time NN 3352 65 13 - - HYPH 3352 65 14 consuming consume VBG 3352 65 15 , , , 3352 65 16 and and CC 3352 65 17 hard hard JJ 3352 65 18 to to TO 3352 65 19 maintain maintain VB 3352 65 20 . . . 3352 66 1 ( ( -LRB- 3352 66 2 The the DT 3352 66 3 Unified Unified NNP 3352 66 4 Medical Medical NNP 3352 66 5 Language Language NNP 3352 66 6 System System NNP 3352 66 7 is be VBZ 3352 66 8 a a DT 3352 66 9 notable notable JJ 3352 66 10 example example NN 3352 66 11 . . . 
3352 67 1 ) ) -RRB- 3352 67 2 6 6 CD 3352 67 3 It -PRON- PRP 3352 67 4 is be VBZ 3352 67 5 the the DT 3352 67 6 authors author NNS 3352 67 7 ’ ’ POS 3352 67 8 impression impression NN 3352 67 9 that that IN 3352 67 10 this this DT 3352 67 11 problem problem NN 3352 67 12 is be VBZ 3352 67 13 worse bad JJR 3352 67 14 in in IN 3352 67 15 searching search VBG 3352 67 16 across across IN 3352 67 17 different different JJ 3352 67 18 media medium NNS 3352 67 19 forms form NNS 3352 67 20 because because IN 3352 67 21 data data NN 3352 67 22 bases basis NNS 3352 67 23 in in IN 3352 67 24 different different JJ 3352 67 25 media medium NNS 3352 67 26 forms form NNS 3352 67 27 tend tend VBP 3352 67 28 to to TO 3352 67 29 be be VB 3352 67 30 created create VBN 3352 67 31 by by IN 3352 67 32 different different JJ 3352 67 33 communities community NNS 3352 67 34 , , , 3352 67 35 increasing increase VBG 3352 67 36 the the DT 3352 67 37 chances chance NNS 3352 67 38 that that IN 3352 67 39 they -PRON- PRP 3352 67 40 will will MD 3352 67 41 use use VB 3352 67 42 different different JJ 3352 67 43 categories category NNS 3352 67 44 , , , 3352 67 45 vocabularies vocabulary NNS 3352 67 46 , , , 3352 67 47 and and CC 3352 67 48 ways way NNS 3352 67 49 of of IN 3352 67 50 thinking thinking NN 3352 67 51 . . . 
3352 68 1 Fortunately fortunately RB 3352 68 2 where where WRB 3352 68 3 data datum NNS 3352 68 4 containing contain VBG 3352 68 5 two two CD 3352 68 6 forms form NNS 3352 68 7 of of IN 3352 68 8 vocabulary vocabulary NN 3352 68 9 are be VBP 3352 68 10 available available JJ 3352 68 11 , , , 3352 68 12 they -PRON- PRP 3352 68 13 can can MD 3352 68 14 be be VB 3352 68 15 used use VBN 3352 68 16 as as IN 3352 68 17 training training NN 3352 68 18 sets set NNS 3352 68 19 for for IN 3352 68 20 statistical statistical JJ 3352 68 21 - - HYPH 3352 68 22 association association NN 3352 68 23 techniques technique NNS 3352 68 24 to to TO 3352 68 25 generate generate VB 3352 68 26 EVIs evi NNS 3352 68 27 auto- auto- XX 3352 68 28 matically matically RB 3352 68 29 , , , 3352 68 30 and and CC 3352 68 31 this this DT 3352 68 32 is be VBZ 3352 68 33 the the DT 3352 68 34 approach approach NN 3352 68 35 that that WDT 3352 68 36 was be VBD 3352 68 37 used use VBN 3352 68 38 . . . 3352 69 1 ( ( -LRB- 3352 69 2 More More JJR 3352 69 3 details detail NNS 3352 69 4 can can MD 3352 69 5 be be VB 3352 69 6 found find VBN 3352 69 7 in in IN 3352 69 8 the the DT 3352 69 9 appendix appendix NNP 3352 69 10 . . . 
3352 69 11 ) ) -RRB- 3352 70 1 From from IN 3352 70 2 text text NN 3352 70 3 words word NNS 3352 70 4 to to IN 3352 70 5 Library Library NNP 3352 70 6 Subject Subject NNP 3352 70 7 Headings Headings NNPS 3352 70 8 An an DT 3352 70 9 EVI EVI NNP 3352 70 10 from from IN 3352 70 11 ordinary ordinary JJ 3352 70 12 English english JJ 3352 70 13 words word NNS 3352 70 14 to to IN 3352 70 15 Library Library NNP 3352 70 16 of of IN 3352 70 17 Congress Congress NNP 3352 70 18 Subject Subject NNP 3352 70 19 Headings Headings NNPS 3352 70 20 ( ( -LRB- 3352 70 21 LCSH LCSH NNP 3352 70 22 ) ) -RRB- 3352 70 23 was be VBD 3352 70 24 created create VBN 3352 70 25 by by IN 3352 70 26 taking take VBG 3352 70 27 catalog catalog NN 3352 70 28 records record NNS 3352 70 29 containing contain VBG 3352 70 30 at at RB 3352 70 31 least least RBS 3352 70 32 one one CD 3352 70 33 subject subject NN 3352 70 34 heading heading NN 3352 70 35 ( ( -LRB- 3352 70 36 6xx 6xx JJ 3352 70 37 field field NN 3352 70 38 in in IN 3352 70 39 the the DT 3352 70 40 MARC MARC NNP 3352 70 41 bibliographic bibliographic JJ 3352 70 42 format format NN 3352 70 43 ) ) -RRB- 3352 70 44 . . . 
3352 71 1 From from IN 3352 71 2 each each DT 3352 71 3 of of IN 3352 71 4 the the DT 3352 71 5 4,246,510 4,246,510 CD 3352 71 6 records record NNS 3352 71 7 used use VBN 3352 71 8 , , , 3352 71 9 main main JJ 3352 71 10 subject subject JJ 3352 71 11 headings heading NNS 3352 71 12 were be VBD 3352 71 13 extracted extract VBN 3352 71 14 ( ( -LRB- 3352 71 15 subfield subfield NNP 3352 71 16 a a DT 3352 71 17 from from IN 3352 71 18 fields field NNS 3352 71 19 600 600 CD 3352 71 20 , , , 3352 71 21 610 610 CD 3352 71 22 , , , 3352 71 23 611 611 CD 3352 71 24 , , , 3352 71 25 630 630 CD 3352 71 26 , , , 3352 71 27 650 650 CD 3352 71 28 , , , 3352 71 29 and and CC 3352 71 30 651 651 CD 3352 71 31 ) ) -RRB- 3352 71 32 and and CC 3352 71 33 fields field NNS 3352 71 34 containing contain VBG 3352 71 35 text text NN 3352 71 36 : : : 3352 71 37 titles title NNS 3352 71 38 ( ( -LRB- 3352 71 39 245a 245a NNP 3352 71 40 ) ) -RRB- 3352 71 41 , , , 3352 71 42 subtitles subtitle NNS 3352 71 43 ( ( -LRB- 3352 71 44 245b 245b SYM 3352 71 45 ) ) -RRB- 3352 71 46 , , , 3352 71 47 and and CC 3352 71 48 summaries summary NNS 3352 71 49 describing describe VBG 3352 71 50 the the DT 3352 71 51 scope scope NN 3352 71 52 and and CC 3352 71 53 general general JJ 3352 71 54 content content NN 3352 71 55 of of IN 3352 71 56 the the DT 3352 71 57 material material NN 3352 71 58 ( ( -LRB- 3352 71 59 520a 520a NNPS 3352 71 60 ) ) -RRB- 3352 71 61 . . . 
3352 72 1 The the DT 3352 72 2 underlying underlie VBG 3352 72 3 assump- assump- JJ 3352 72 4 tion tion NN 3352 72 5 is be VBZ 3352 72 6 that that IN 3352 72 7 for for IN 3352 72 8 each each DT 3352 72 9 record record NN 3352 72 10 , , , 3352 72 11 the the DT 3352 72 12 words word NNS 3352 72 13 in in IN 3352 72 14 the the DT 3352 72 15 “ " `` 3352 72 16 text text NN 3352 72 17 ” " '' 3352 72 18 fields field NNS 3352 72 19 ( ( -LRB- 3352 72 20 245a 245a NNS 3352 72 21 , , , 3352 72 22 b b NN 3352 72 23 and and CC 3352 72 24 520a 520a NNPS 3352 72 25 ) ) -RRB- 3352 72 26 tend tend VBP 3352 72 27 to to TO 3352 72 28 be be VB 3352 72 29 characteristic characteristic JJ 3352 72 30 of of IN 3352 72 31 discourse discourse NN 3352 72 32 on on IN 3352 72 33 the the DT 3352 72 34 subject subject NN 3352 72 35 ( ( -LRB- 3352 72 36 6xxa 6xxa NNP 3352 72 37 ) ) -RRB- 3352 72 38 . . . 3352 73 1 Two two CD 3352 73 2 examples example NNS 3352 73 3 , , , 3352 73 4 with with IN 3352 73 5 identifying identify VBG 3352 73 6 LCCNs lccn NNS 3352 73 7 in in IN 3352 73 8 the the DT 3352 73 9 < < XX 3352 73 10 001 001 CD 3352 73 11 > > JJ 3352 73 12 field field NN 3352 73 13 are be VBP 3352 73 14 : : : 3352 73 15 < < XX 3352 73 16 001>73180254 001>73180254 CD 3352 73 17 //r86 > XX 3352 73 19 < < XX 3352 73 20 245> > XX 3352 73 22 A a DT 3352 73 23 study study NN 3352 73 24 of of IN 3352 73 25 operant operant JJ 3352 73 26 conditioning condition VBG 3352 73 27 under under IN 3352 73 28 delayed delay VBN 3352 73 29 reinforcement reinforcement NN 3352 73 30 in in IN 3352 73 31 early early JJ 3352 73 32 infancy > XX 3352 73 34 < < XX 3352 73 35 650> > XX 3352 73 37 Infant Infant NNP 3352 73 38 psychology > XX 3352 73 40 < < XX 3352 73 41 650> > FW 3352 73 43 Operant operant JJ 3352 73 44 conditioning > XX 3352 73 46 < < XX 3352 73 47 001>73180255 001>73180255 NNP 3352 73 48 < < XX 3352 73 49 /001 /001 . 
3352 73 50 > > XX 3352 73 51 < < XX 3352 73 52 245> > XX 3352 73 54 Reptilian Reptilian NNP 3352 73 55 disease > XX 3352 73 57 recognition recognition NN 3352 73 58 and and CC 3352 73 59 treatment > XX 3352 73 61 < < XX 3352 73 62 650> > NN 3352 73 64 Reptiles > XX 3352 73 66 Diseases > XX 3352 73 68 The the DT 3352 73 69 words word NNS 3352 73 70 in in IN 3352 73 71 the the DT 3352 73 72 text text NN 3352 73 73 fields field NNS 3352 73 74 ( ( -LRB- 3352 73 75 245a 245a NNS 3352 73 76 , , , 3352 73 77 245b 245b NNPS 3352 73 78 , , , 3352 73 79 and and CC 3352 73 80 520a 520a NNPS 3352 73 81 ) ) -RRB- 3352 73 82 were be VBD 3352 73 83 extracted extract VBN 3352 73 84 . . . 3352 74 1 Stop stop VB 3352 74 2 words word NNS 3352 74 3 were be VBD 3352 74 4 removed remove VBN 3352 74 5 and and CC 3352 74 6 the the DT 3352 74 7 remainder remainder NN 3352 74 8 normalized normalize VBD 3352 74 9 . . . 3352 75 1 Then then RB 3352 75 2 the the DT 3352 75 3 degree degree NN 3352 75 4 to to TO 3352 75 5 which which WDT 3352 75 6 each each DT 3352 75 7 word word NN 3352 75 8 is be VBZ 3352 75 9 asso- asso- RB 3352 75 10 ciated ciate VBN 3352 75 11 with with IN 3352 75 12 each each DT 3352 75 13 subject subject NN 3352 75 14 heading heading NN 3352 75 15 ( ( -LRB- 3352 75 16 by by IN 3352 75 17 co co NN 3352 75 18 - - NN 3352 75 19 occurring occur VBG 3352 75 20 in in IN 3352 75 21 the the DT 3352 75 22 same same JJ 3352 75 23 records record NNS 3352 75 24 ) ) -RRB- 3352 75 25 was be VBD 3352 75 26 computed compute VBN 3352 75 27 using use VBG 3352 75 28 a a DT 3352 75 29 maximum maximum JJ 3352 75 30 likelihood likelihood NN 3352 75 31 ratio ratio NN 3352 75 32 - - HYPH 3352 75 33 based base VBN 3352 75 34 measure measure NN 3352 75 35 . . . 
3352 76 1 Natural natural JJ 3352 76 2 - - HYPH 3352 76 3 language language NN 3352 76 4 processing processing NN 3352 76 5 can can MD 3352 76 6 be be VB 3352 76 7 used use VBN 3352 76 8 to to TO 3352 76 9 identify identify VB 3352 76 10 adjective adjective JJ 3352 76 11 - - HYPH 3352 76 12 noun noun JJ 3352 76 13 phrases phrase NNS 3352 76 14 to to TO 3352 76 15 support support VB 3352 76 16 more more RBR 3352 76 17 precise precise JJ 3352 76 18 searching searching NN 3352 76 19 using use VBG 3352 76 20 phrases phrase NNS 3352 76 21 as as RB 3352 76 22 well well RB 3352 76 23 as as IN 3352 76 24 individual individual JJ 3352 76 25 words word NNS 3352 76 26 . . . 3352 77 1 A a DT 3352 77 2 very very RB 3352 77 3 large large JJ 3352 77 4 matrix matrix NN 3352 77 5 shows show VBZ 3352 77 6 the the DT 3352 77 7 association association NN 3352 77 8 of of IN 3352 77 9 each each DT 3352 77 10 text text NN 3352 77 11 word word NN 3352 77 12 ( ( -LRB- 3352 77 13 or or CC 3352 77 14 phrase phrase VB 3352 77 15 ) ) -RRB- 3352 77 16 with with IN 3352 77 17 each each DT 3352 77 18 subject subject JJ 3352 77 19 heading heading NN 3352 77 20 ; ; : 3352 77 21 so so CC 3352 77 22 , , , 3352 77 23 for for IN 3352 77 24 any any DT 3352 77 25 given give VBN 3352 77 26 word word NN 3352 77 27 ( ( -LRB- 3352 77 28 or or CC 3352 77 29 combination combination NN 3352 77 30 of of IN 3352 77 31 words word NNS 3352 77 32 ) ) -RRB- 3352 77 33 , , , 3352 77 34 a a DT 3352 77 35 list list NN 3352 77 36 of of IN 3352 77 37 the the DT 3352 77 38 most most RBS 3352 77 39 closely closely RB 3352 77 40 associated associate VBN 3352 77 41 headings heading NNS 3352 77 42 , , , 3352 77 43 ranked rank VBN 3352 77 44 by by IN 3352 77 45 degree degree NN 3352 77 46 of of IN 3352 77 47 association association NN 3352 77 48 , , , 3352 77 49 can can MD 3352 77 50 be be VB 3352 77 51 derived derive VBN 3352 77 52 from from IN 3352 77 53 the the DT 3352 77 54 matrix matrix NN 3352 77 55 . . . 
3352 78 1 Queries query NNS 3352 78 2 A a DT 3352 78 3 query query NN 3352 78 4 , , , 3352 78 5 which which WDT 3352 78 6 can can MD 3352 78 7 be be VB 3352 78 8 a a DT 3352 78 9 single single JJ 3352 78 10 word word NN 3352 78 11 , , , 3352 78 12 a a DT 3352 78 13 phrase phrase NN 3352 78 14 , , , 3352 78 15 a a DT 3352 78 16 set set NN 3352 78 17 of of IN 3352 78 18 keywords keyword NNS 3352 78 19 , , , 3352 78 20 a a DT 3352 78 21 book book NN 3352 78 22 title title NN 3352 78 23 , , , 3352 78 24 and and CC 3352 78 25 so so RB 3352 78 26 on on RB 3352 78 27 , , , 3352 78 28 is be VBZ 3352 78 29 normalized normalized JJ 3352 78 30 in in IN 3352 78 31 the the DT 3352 78 32 same same JJ 3352 78 33 way way NN 3352 78 34 and and CC 3352 78 35 looked look VBD 3352 78 36 up up RP 3352 78 37 in in IN 3352 78 38 the the DT 3352 78 39 matrix matrix NN 3352 78 40 to to TO 3352 78 41 produce produce VB 3352 78 42 a a DT 3352 78 43 ranked rank VBN 3352 78 44 list list NN 3352 78 45 of of IN 3352 78 46 the the DT 3352 78 47 most most RBS 3352 78 48 closely closely RB 3352 78 49 associated associate VBN 3352 78 50 subject subject JJ 3352 78 51 headings heading NNS 3352 78 52 as as IN 3352 78 53 candidate candidate NN 3352 78 54 LCSH LCSH NNP 3352 78 55 search search NN 3352 78 56 terms term NNS 3352 78 57 . . . 
3352 79 1 For for IN 3352 79 2 example example NN 3352 79 3 , , , 3352 79 4 entering enter VBG 3352 79 5 the the DT 3352 79 6 textual textual JJ 3352 79 7 query query NN 3352 79 8 words word NNS 3352 79 9 “ " `` 3352 79 10 Peanut Peanut NNP 3352 79 11 ” " '' 3352 79 12 and and CC 3352 79 13 “ " `` 3352 79 14 Butter Butter NNP 3352 79 15 ” " '' 3352 79 16 generates generate VBZ 3352 79 17 the the DT 3352 79 18 following follow VBG 3352 79 19 ranking ranking JJ 3352 79 20 list list NN 3352 79 21 of of IN 3352 79 22 LCSH LCSH NNP 3352 79 23 main main JJ 3352 79 24 headings heading NNS 3352 79 25 as as IN 3352 79 26 candi- candi- JJ 3352 79 27 dates date NNS 3352 79 28 for for IN 3352 79 29 searching search VBG 3352 79 30 : : : 3352 79 31 Rank Rank NNP 3352 79 32 LCSH LCSH NNP 3352 79 33 ( ( -LRB- 3352 79 34 subfield subfield NNP 3352 79 35 650a 650a NNPS 3352 79 36 ) ) -RRB- 3352 79 37 1 1 CD 3352 79 38 . . . 3352 80 1 Peanut peanut NN 3352 80 2 2 2 CD 3352 80 3 . . . 3352 81 1 Cookery cookery NN 3352 81 2 ( ( -LRB- 3352 81 3 peanut peanut NN 3352 81 4 butter butter NN 3352 81 5 ) ) -RRB- 3352 81 6 3 3 CD 3352 81 7 . . . 3352 82 1 Cookery cookery NN 3352 82 2 ( ( -LRB- 3352 82 3 peanuts peanut NNS 3352 82 4 ) ) -RRB- 3352 82 5 4 4 CD 3352 82 6 . . . 3352 83 1 Peanut peanut NN 3352 83 2 industry industry NN 3352 83 3 5 5 CD 3352 83 4 . . . 3352 84 1 Peanut peanut NN 3352 84 2 butter butter NN 3352 84 3 6 6 CD 3352 84 4 . . . 3352 85 1 Butter butter NN 3352 85 2 7 7 CD 3352 85 3 . . . 3352 86 1 Schulz Schulz NNP 3352 86 2 , , , 3352 86 3 Charles Charles NNP 3352 86 4 M. M. NNP 3352 86 5 This this DT 3352 86 6 display display NN 3352 86 7 is be VBZ 3352 86 8 an an DT 3352 86 9 important important JJ 3352 86 10 departure departure NN 3352 86 11 from from IN 3352 86 12 traditional traditional JJ 3352 86 13 fully fully RB 3352 86 14 automatic automatic JJ 3352 86 15 searching searching NN 3352 86 16 . . . 
3352 87 1 The the DT 3352 87 2 list list NN 3352 87 3 is be VBZ 3352 87 4 , , , 3352 87 5 in in IN 3352 87 6 effect effect NN 3352 87 7 , , , 3352 87 8 a a DT 3352 87 9 prompt prompt NN 3352 87 10 , , , 3352 87 11 indicating indicate VBG 3352 87 12 probably probably RB 3352 87 13 suitable suitable JJ 3352 87 14 query query NN 3352 87 15 terms term NNS 3352 87 16 in in IN 3352 87 17 the the DT 3352 87 18 vocabulary vocabulary NN 3352 87 19 of of IN 3352 87 20 the the DT 3352 87 21 target target NN 3352 87 22 resource resource NN 3352 87 23 . . . 3352 88 1 It -PRON- PRP 3352 88 2 introduces introduce VBZ 3352 88 3 the the DT 3352 88 4 searcher searcher NN 3352 88 5 to to IN 3352 88 6 the the DT 3352 88 7 categories category NNS 3352 88 8 and and CC 3352 88 9 terminology terminology NN 3352 88 10 of of IN 3352 88 11 the the DT 3352 88 12 system system NN 3352 88 13 and and CC 3352 88 14 enables enable VBZ 3352 88 15 the the DT 3352 88 16 searcher searcher NN 3352 88 17 to to TO 3352 88 18 use use VB 3352 88 19 expert expert JJ 3352 88 20 judgment judgment NN 3352 88 21 to to TO 3352 88 22 select select VB 3352 88 23 the the DT 3352 88 24 heading heading NN 3352 88 25 that that WDT 3352 88 26 seems seem VBZ 3352 88 27 best good JJS 3352 88 28 for for IN 3352 88 29 the the DT 3352 88 30 search search NN 3352 88 31 . . . 
3352 89 1 From from IN 3352 89 2 text text NN 3352 89 3 words word NNS 3352 89 4 to to IN 3352 89 5 the the DT 3352 89 6 metadata metadata NN 3352 89 7 vocabularies vocabulary NNS 3352 89 8 in in IN 3352 89 9 numeric numeric JJ 3352 89 10 data datum NNS 3352 89 11 sets set NNS 3352 89 12 A a DT 3352 89 13 training training NN 3352 89 14 set set NN 3352 89 15 of of IN 3352 89 16 records record NNS 3352 89 17 containing contain VBG 3352 89 18 both both DT 3352 89 19 descriptive descriptive JJ 3352 89 20 words word NNS 3352 89 21 and and CC 3352 89 22 topical topical JJ 3352 89 23 metadata metadata NN 3352 89 24 is be VBZ 3352 89 25 often often RB 3352 89 26 not not RB 3352 89 27 readily readily RB 3352 89 28 available available JJ 3352 89 29 for for IN 3352 89 30 numeric numeric JJ 3352 89 31 data datum NNS 3352 89 32 sets set NNS 3352 89 33 . . . 3352 90 1 The the DT 3352 90 2 authors author NNS 3352 90 3 ’ ’ POS 3352 90 4 first first JJ 3352 90 5 effort effort NN 3352 90 6 was be VBD 3352 90 7 to to TO 3352 90 8 create create VB 3352 90 9 an an DT 3352 90 10 EVI evi NN 3352 90 11 to to IN 3352 90 12 the the DT 3352 90 13 Standard Standard NNP 3352 90 14 Industrial Industrial NNP 3352 90 15 Classification Classification NNP 3352 90 16 ( ( -LRB- 3352 90 17 SIC SIC NNP 3352 90 18 ) ) -RRB- 3352 90 19 , , , 3352 90 20 widely widely RB 3352 90 21 used use VBN 3352 90 22 over over IN 3352 90 23 many many JJ 3352 90 24 years year NNS 3352 90 25 in in IN 3352 90 26 numeric numeric JJ 3352 90 27 data datum NNS 3352 90 28 sets set NNS 3352 90 29 . . . 
3352 91 1 ( ( -LRB- 3352 91 2 SIC sic NN 3352 91 3 codes code NNS 3352 91 4 were be VBD 3352 91 5 associated associate VBN 3352 91 6 with with IN 3352 91 7 words word NNS 3352 91 8 by by IN 3352 91 9 using use VBG 3352 91 10 , , , 3352 91 11 as as IN 3352 91 12 a a DT 3352 91 13 training training NN 3352 91 14 set set NN 3352 91 15 , , , 3352 91 16 the the DT 3352 91 25 titles title NNS 3352 91 26 in in IN 3352 91 27 a a DT 3352 91 28 bibliographic bibliographic JJ 3352 91 29 database database NN 3352 91 30 that that WDT 3352 91 31 used use VBD 3352 91 32 SIC sic NN 3352 91 33 codes code NNS 3352 91 34 . . . 3352 91 35 ) ) -RRB- 3352 92 1 But but CC 3352 92 2 by by IN 3352 92 3 the the DT 3352 92 4 time time NN 3352 92 5 the the DT 3352 92 6 SIC SIC NNP 3352 92 7 EVI EVI NNP 3352 92 8 was be VBD 3352 92 9 completed complete VBN 3352 92 10 , , , 3352 92 11 SIC SIC NNP 3352 92 12 had have VBD 3352 92 13 been be VBN 3352 92 14 dis- dis- RB 3352 92 15 continued continue VBN 3352 92 16 and and CC 3352 92 17 replaced replace VBN 3352 92 18 by by IN 3352 92 19 the the DT 3352 92 20 North North NNP 3352 92 21 American American NNP 3352 92 22 Industry Industry NNP 3352 92 23 Classification Classification NNP 3352 92 24 System System NNP 3352 92 25 ( ( -LRB- 3352 92 26 NAICS NAICS NNP 3352 92 27 ) ) -RRB- 3352 92 28 , , , 3352 92 29 so so CC 3352 92 30 a a DT 3352 92 31 mapping mapping NN 3352 92 32 was be VBD 3352 92 33 created create VBN 3352 92 34 from from IN 3352 92 35 SIC sic NN 3352 92 36 codes code NNS 3352 92 37 to to IN 3352 92 38 NAICS NAICS NNP 3352 92 39 codes code NNS 3352 92 40 . . . 
3352 93 1 Figures figure NNS 3352 93 2 1–3 1–3 CD 3352 93 3 show show VBP 3352 93 4 stages stage NNS 3352 93 5 in in IN 3352 93 6 an an DT 3352 93 7 interface interface NN 3352 93 8 that that WDT 3352 93 9 accepts accept VBZ 3352 93 10 a a DT 3352 93 11 searcher searcher NN 3352 93 12 ’s ’s NN 3352 93 13 query query NN 3352 93 14 “ " `` 3352 93 15 car car NN 3352 93 16 ” " '' 3352 93 17 ( ( -LRB- 3352 93 18 figure figure NN 3352 93 19 1 1 CD 3352 93 20 ) ) -RRB- 3352 93 21 , , , 3352 93 22 prompts prompt VBZ 3352 93 23 with with IN 3352 93 24 a a DT 3352 93 25 ranked rank VBN 3352 93 26 list list NN 3352 93 27 of of IN 3352 93 28 NAICS NAICS NNP 3352 93 29 codes code NNS 3352 93 30 ( ( -LRB- 3352 93 31 figure figure NN 3352 93 32 2 2 CD 3352 93 33 ) ) -RRB- 3352 93 34 , , , 3352 93 35 then then RB 3352 93 36 extends extend VBZ 3352 93 37 the the DT 3352 93 38 search search NN 3352 93 39 with with IN 3352 93 40 the the DT 3352 93 41 selected select VBN 3352 93 42 NAICS NAICS NNP 3352 93 43 code code NN 3352 93 44 to to TO 3352 93 45 retrieve retrieve VB 3352 93 46 numeric numeric JJ 3352 93 47 data datum NNS 3352 93 48 ( ( -LRB- 3352 93 49 figure figure NN 3352 93 50 3 3 CD 3352 93 51 ) ) -RRB- 3352 93 52 . . . 
3352 94 1 By by IN 3352 94 2 this this DT 3352 94 3 time time NN 3352 94 4 , , , 3352 94 5 however however RB 3352 94 6 , , , 3352 94 7 it -PRON- PRP 3352 94 8 had have VBD 3352 94 9 become become VBN 3352 94 10 apparent apparent JJ 3352 94 11 that that IN 3352 94 12 , , , 3352 94 13 with with IN 3352 94 14 the the DT 3352 94 15 current current JJ 3352 94 16 low low JJ 3352 94 17 level level NN 3352 94 18 of of IN 3352 94 19 interoperability interoperability NN 3352 94 20 in in IN 3352 94 21 software software NN 3352 94 22 and and CC 3352 94 23 in in IN 3352 94 24 data data NN 3352 94 25 formats format NNS 3352 94 26 , , , 3352 94 27 the the DT 3352 94 28 labor labor NN 3352 94 29 required require VBN 3352 94 30 to to TO 3352 94 31 create create VB 3352 94 32 EVIs evi NNS 3352 94 33 and and CC 3352 94 34 interfaces interface NNS 3352 94 35 to to IN 3352 94 36 each each DT 3352 94 37 large large JJ 3352 94 38 traditional traditional JJ 3352 94 39 numeric numeric NNP 3352 94 40 data data NNP 3352 94 41 series series NN 3352 94 42 was be VBD 3352 94 43 enormous enormous JJ 3352 94 44 . . . 3352 95 1 Therefore therefore RB 3352 95 2 , , , 3352 95 3 attention attention NN 3352 95 4 was be VBD 3352 95 5 turned turn VBN 3352 95 6 to to IN 3352 95 7 a a DT 3352 95 8 collection collection NN 3352 95 9 of of IN 3352 95 10 different different JJ 3352 95 11 numeric numeric JJ 3352 95 12 data data NN 3352 95 13 sets set NNS 3352 95 14 available available JJ 3352 95 15 through through IN 3352 95 16 a a DT 3352 95 17 single single JJ 3352 95 18 interface interface NN 3352 95 19 , , , 3352 95 20 Counting Counting NNP 3352 95 21 California California NNP 3352 95 22 , , , 3352 95 23 made make VBN 3352 95 24 available available JJ 3352 95 25 by by IN 3352 95 26 California California NNP 3352 95 27 Digital Digital NNP 3352 95 28 Library Library NNP 3352 95 29 at at IN 3352 95 30 http://countingcalifornia.cdlib.org http://countingcalifornia.cdlib.org NNP 3352 95 31 . . . 
3352 96 1 This this DT 3352 96 2 resource resource NN 3352 96 3 is be VBZ 3352 96 4 a a DT 3352 96 5 collection collection NN 3352 96 6 of of IN 3352 96 7 some some DT 3352 96 8 three three CD 3352 96 9 thousand thousand CD 3352 96 10 numeric numeric JJ 3352 96 11 tables table NNS 3352 96 12 containing contain VBG 3352 96 13 statistics statistic NNS 3352 96 14 related relate VBN 3352 96 15 to to IN 3352 96 16 a a DT 3352 96 17 range range NN 3352 96 18 of of IN 3352 96 19 topics topic NNS 3352 96 20 . . . 3352 97 1 The the DT 3352 97 2 numeric numeric JJ 3352 97 3 data data NN 3352 97 4 sets set NNS 3352 97 5 are be VBP 3352 97 6 mainly mainly RB 3352 97 7 from from IN 3352 97 8 the the DT 3352 97 9 California California NNP 3352 97 10 Department Department NNP 3352 97 11 of of IN 3352 97 12 Health Health NNP 3352 97 13 Services Services NNPS 3352 97 14 , , , 3352 97 15 the the DT 3352 97 16 California California NNP 3352 97 17 Department Department NNP 3352 97 18 of of IN 3352 97 19 Finance Finance NNP 3352 97 20 , , , 3352 97 21 and and CC 3352 97 22 the the DT 3352 97 23 federal federal JJ 3352 97 24 Bureau Bureau NNP 3352 97 25 of of IN 3352 97 26 the the DT 3352 97 27 Census Census NNP 3352 97 28 . . . 3352 98 1 The the DT 3352 98 2 tables table NNS 3352 98 3 are be VBP 3352 98 4 organized organize VBN 3352 98 5 under under IN 3352 98 6 a a DT 3352 98 7 two two CD 3352 98 8 - - HYPH 3352 98 9 level level NN 3352 98 10 classification classification NN 3352 98 11 scheme scheme NN 3352 98 12 . . . 3352 99 1 There there EX 3352 99 2 are be VBP 3352 99 3 sixteen sixteen CD 3352 99 4 topics topic NNS 3352 99 5 at at IN 3352 99 6 the the DT 3352 99 7 top top JJ 3352 99 8 level level NN 3352 99 9 , , , 3352 99 10 which which WDT 3352 99 11 are be VBP 3352 99 12 subdi- subdi- NN 3352 99 13 vided vide VBD 3352 99 14 into into IN 3352 99 15 a a DT 3352 99 16 total total NN 3352 99 17 of of IN 3352 99 18 184 184 CD 3352 99 19 subtopics subtopic NNS 3352 99 20 . . . 
3352 100 1 All all PDT 3352 100 2 the the DT 3352 100 3 numeric numeric JJ 3352 100 4 tables table NNS 3352 100 5 were be VBD 3352 100 6 assigned assign VBN 3352 100 7 to to IN 3352 100 8 one one CD 3352 100 9 or or CC 3352 100 10 more more JJR 3352 100 11 subtopics subtopic NNS 3352 100 12 and and CC 3352 100 13 each each DT 3352 100 14 table table NN 3352 100 15 has have VBZ 3352 100 16 a a DT 3352 100 17 caption caption NN 3352 100 18 . . . 3352 101 1 At at IN 3352 101 2 the the DT 3352 101 3 Counting Counting NNP 3352 101 4 California California NNP 3352 101 5 Web web NN 3352 101 6 site site NN 3352 101 7 , , , 3352 101 8 a a DT 3352 101 9 searcher searcher NN 3352 101 10 can can MD 3352 101 11 browse browse VB 3352 101 12 for for IN 3352 101 13 tables table NNS 3352 101 14 by by IN 3352 101 15 selecting select VBG 3352 101 16 a a DT 3352 101 17 higher high JJR 3352 101 18 - - HYPH 3352 101 19 level level NN 3352 101 20 topic topic NN 3352 101 21 , , , 3352 101 22 then then RB 3352 101 23 a a DT 3352 101 24 lower low JJR 3352 101 25 - - HYPH 3352 101 26 level level NN 3352 101 27 subtopic subtopic NN 3352 101 28 , , , 3352 101 29 and and CC 3352 101 30 then then RB 3352 101 31 a a DT 3352 101 32 table table NN 3352 101 33 . . . 3352 102 1 Two two CD 3352 102 2 additional additional JJ 3352 102 3 ways way NNS 3352 102 4 were be VBD 3352 102 5 created create VBN 3352 102 6 to to TO 3352 102 7 access access VB 3352 102 8 the the DT 3352 102 9 tables table NNS 3352 102 10 : : : 3352 102 11 Probabilistic probabilistic JJ 3352 102 12 retrieval retrieval NN 3352 102 13 , , , 3352 102 14 and and CC 3352 102 15 an an DT 3352 102 16 EVI evi NN 3352 102 17 to to IN 3352 102 18 the the DT 3352 102 19 topical topical JJ 3352 102 20 categories category NNS 3352 102 21 . . . 
3352 103 1 The the DT 3352 103 2 cap- cap- JJ 3352 103 3 tions tion NNS 3352 103 4 , , , 3352 103 5 topics topic NNS 3352 103 6 , , , 3352 103 7 and and CC 3352 103 8 subtopics subtopic NNS 3352 103 9 were be VBD 3352 103 10 extracted extract VBN 3352 103 11 for for IN 3352 103 12 each each DT 3352 103 13 of of IN 3352 103 14 the the DT 3352 103 15 three three CD 3352 103 16 thousand thousand CD 3352 103 17 tables table NNS 3352 103 18 , , , 3352 103 19 and and CC 3352 103 20 XML xml NN 3352 103 21 records record NNS 3352 103 22 were be VBD 3352 103 23 created create VBN 3352 103 24 in in IN 3352 103 25 the the DT 3352 103 26 following follow VBG 3352 103 27 form form NN 3352 103 28 : : : 3352 103 29 < < XX 3352 103 30 table table NN 3352 103 31 > > XX 3352 103 32 < < XX 3352 103 33 topic topic NN 3352 103 34 > > XX 3352 103 35 education education NN 3352 103 36 < < XX 3352 103 37 /topic /topic . 3352 103 38 > > XX 3352 103 39 < < XX 3352 103 40 subtopic subtopic JJ 3352 103 41 > > XX 3352 103 42 libraries library NNS 3352 103 43 < < XX 3352 103 44 /subtopic /subtopic . 3352 103 45 > > XX 3352 103 46 < < XX 3352 103 47 caption caption NN 3352 103 48 > > XX 3352 103 49 library library NN 3352 103 50 statistics statistic NNS 3352 103 51 , , , 3352 103 52 statewide statewide JJ 3352 103 53 summary summary NN 3352 103 54 by by IN 3352 103 55 type type NN 3352 103 56 of of IN 3352 103 57 library library NN 3352 103 58 California California NNP 3352 103 59 1992–93 1992–93 CD 3352 103 60 to to IN 3352 103 61 1997–98 1997–98 CD 3352 103 62 < < XX 3352 103 63 /cap- /cap- . 
3352 103 64 tion tion NN 3352 103 65 > > XX 3352 103 66 < < XX 3352 103 67 /table /table , 3352 103 68 > > XX 3352 103 69 Retrieval retrieval NN 3352 103 70 Two two CD 3352 103 71 search search NN 3352 103 72 methods method NNS 3352 103 73 were be VBD 3352 103 74 used use VBN 3352 103 75 : : : 3352 103 76 Direct Direct NNP 3352 103 77 Probabilistic Probabilistic NNP 3352 103 78 Retrieval Retrieval NNP 3352 103 79 . . . 3352 104 1 An an DT 3352 104 2 in in IN 3352 104 3 - - HYPH 3352 104 4 house house NN 3352 104 5 implementa- implementa- NNP 3352 104 6 tion tion NNP 3352 104 7 was be VBD 3352 104 8 used use VBN 3352 104 9 of of IN 3352 104 10 a a DT 3352 104 11 probabilistic probabilistic JJ 3352 104 12 full full JJ 3352 104 13 - - HYPH 3352 104 14 text text NN 3352 104 15 retrieval retrieval NN 3352 104 16 algo- algo- NNP 3352 104 17 rithm rithm NNP 3352 104 18 developed develop VBD 3352 104 19 at at IN 3352 104 20 Berkeley.7 Berkeley.7 NNP 3352 104 21 This this DT 3352 104 22 search search NN 3352 104 23 engine engine NN 3352 104 24 takes take VBZ 3352 104 25 a a DT 3352 104 26 free free JJ 3352 104 27 - - HYPH 3352 104 28 form form NN 3352 104 29 text text NN 3352 104 30 query query NN 3352 104 31 and and CC 3352 104 32 returns return VBZ 3352 104 33 a a DT 3352 104 34 ranked rank VBN 3352 104 35 list list NN 3352 104 36 of of IN 3352 104 37 captions caption NNS 3352 104 38 of of IN 3352 104 39 tables table NNS 3352 104 40 ranked rank VBD 3352 104 41 according accord VBG 3352 104 42 to to IN 3352 104 43 their -PRON- PRP$ 3352 104 44 relevance relevance NN 3352 104 45 scores score NNS 3352 104 46 . . . 
3352 105 1 For for IN 3352 105 2 example example NN 3352 105 3 , , , 3352 105 4 the the DT 3352 105 5 five five CD 3352 105 6 top top JJ 3352 105 7 - - HYPH 3352 105 8 ranked rank VBN 3352 105 9 captions caption NNS 3352 105 10 returned return VBD 3352 105 11 to to IN 3352 105 12 the the DT 3352 105 13 query query NN 3352 105 14 “ " `` 3352 105 15 Public Public NNP 3352 105 16 Libraries Libraries NNPS 3352 105 17 in in IN 3352 105 18 California California NNP 3352 105 19 ” " '' 3352 105 20 were be VBD 3352 105 21 : : : 3352 105 22 Figure figure NN 3352 105 23 1 1 CD 3352 105 24 . . . 3352 106 1 Query query JJ 3352 106 2 interface interface NN 3352 106 3 for for IN 3352 106 4 search search NN 3352 106 5 - - HYPH 3352 106 6 term term NN 3352 106 7 recommender recommender NN 3352 106 8 system system NN 3352 106 9 f f NNP 3352 106 10 or or CC 3352 106 11 the the DT 3352 106 12 North North NNP 3352 106 13 American American NNP 3352 106 14 Industry Industry NNP 3352 106 15 Classification Classification NNP 3352 106 16 System System NNP 3352 106 17 Figure Figure NNP 3352 106 18 2 2 CD 3352 106 19 . . . 3352 107 1 Display display NN 3352 107 2 of of IN 3352 107 3 NAICS NAICS NNP 3352 107 4 code code NN 3352 107 5 search search NN 3352 107 6 - - HYPH 3352 107 7 term term NN 3352 107 8 recommendations recommendation NNS 3352 107 9 for for IN 3352 107 10 “ " `` 3352 107 11 car car NN 3352 107 12 ” " '' 3352 107 13 Figure figure NN 3352 107 14 3 3 CD 3352 107 15 . . . 
3352 108 1 Display display NN 3352 108 2 of of IN 3352 108 3 numeric numeric JJ 3352 108 4 data datum NNS 3352 108 5 retrieved retrieve VBN 3352 108 6 using use VBG 3352 108 7 selected select VBN 3352 108 8 NAICS NAICS NNP 3352 108 9 code code NN 3352 108 24 1 1 CD 3352 108 25 . . . 3352 109 1 Library library JJ 3352 109 2 statistics statistic NNS 3352 109 3 , , , 3352 109 4 Statewide statewide JJ 3352 109 5 summary summary NN 3352 109 6 by by IN 3352 109 7 type type NN 3352 109 8 of of IN 3352 109 9 library library NN 3352 109 10 California California NNP 3352 109 11 , , , 3352 109 12 1992–93 1992–93 CD 3352 109 13 to to TO 3352 109 14 1997–98 1997–98 CD 3352 109 15 Table table NN 3352 109 16 F6 f6 NN 3352 109 17 . . . 3352 110 1 2 2 LS 3352 110 2 . . . 3352 111 1 Library library JJ 3352 111 2 statistics statistic NNS 3352 111 3 , , , 3352 111 4 Statewide statewide JJ 3352 111 5 summary summary NN 3352 111 6 by by IN 3352 111 7 type type NN 3352 111 8 of of IN 3352 111 9 library library JJ 3352 111 10 California California NNP 3352 111 11 , , , 3352 111 12 1993–94 1993–94 CD 3352 111 13 to to IN 3352 111 14 1998–99 1998–99 CD 3352 111 15 Table Table NNP 3352 111 16 F6YR0 F6YR0 NNP 3352 111 17 - - HYPH 3352 111 18 0 0 NFP 3352 111 19 . . . 3352 112 1 3 3 LS 3352 112 2 . . . 3352 113 1 Number number NN 3352 113 2 of of IN 3352 113 3 California California NNP 3352 113 4 libraries library NNS 3352 113 5 , , , 3352 113 6 1989 1989 CD 3352 113 7 to to IN 3352 113 8 1999 1999 CD 3352 113 9 Table table NN 3352 113 10 F5YR00 f5yr00 NN 3352 113 11 4 4 CD 3352 113 12 . . . 
3352 114 1 Number number NN 3352 114 2 of of IN 3352 114 3 California California NNP 3352 114 4 libraries library NNS 3352 114 5 , , , 3352 114 6 1989 1989 CD 3352 114 7 to to IN 3352 114 8 1998 1998 CD 3352 114 9 , , , 3352 114 10 as as IN 3352 114 11 of of IN 3352 114 12 September September NNP 3352 114 13 Table Table NNP 3352 114 14 F5 F5 NNP 3352 114 15 . . . 3352 115 1 5 5 CD 3352 115 2 . . . 3352 116 1 California California NNP 3352 116 2 Public Public NNP 3352 116 3 Schools Schools NNPS 3352 116 4 , , , 3352 116 5 Grades Grades NNP 3352 116 6 K–12 K–12 NNP 3352 116 7 , , , 3352 116 8 1989 1989 CD 3352 116 9 to to IN 3352 116 10 1998 1998 CD 3352 116 11 Table Table NNP 3352 116 12 F4 F4 NNP 3352 116 13 . . . 3352 117 1 Each each DT 3352 117 2 entry entry NN 3352 117 3 in in IN 3352 117 4 the the DT 3352 117 5 retrieved retrieve VBN 3352 117 6 set set NN 3352 117 7 list list NN 3352 117 8 is be VBZ 3352 117 9 linked link VBN 3352 117 10 to to IN 3352 117 11 a a DT 3352 117 12 numeric numeric JJ 3352 117 13 table table NN 3352 117 14 maintained maintain VBN 3352 117 15 at at IN 3352 117 16 the the DT 3352 117 17 Counting Counting NNP 3352 117 18 California California NNP 3352 117 19 Web web NN 3352 117 20 site site NN 3352 117 21 and and CC 3352 117 22 , , , 3352 117 23 by by IN 3352 117 24 clicking click VBG 3352 117 25 on on IN 3352 117 26 the the DT 3352 117 27 appropriate appropriate JJ 3352 117 28 link link NN 3352 117 29 , , , 3352 117 30 a a DT 3352 117 31 user user NN 3352 117 32 can can MD 3352 117 33 display display VB 3352 117 34 the the DT 3352 117 35 table table NN 3352 117 36 as as IN 3352 117 37 an an DT 3352 117 38 MS MS NNP 3352 117 39 Excel Excel NNP 3352 117 40 file file NN 3352 117 41 or or CC 3352 117 42 as as IN 3352 117 43 a a DT 3352 117 44 PDF PDF NNP 3352 117 45 file file NN 3352 117 46 . . . 3352 118 1 Mediated Mediated NNP 3352 118 2 Search Search NNP 3352 118 3 . . . 
3352 119 1 From from IN 3352 119 2 the the DT 3352 119 3 same same JJ 3352 119 4 extracted extract VBN 3352 119 5 records record NNS 3352 119 6 the the DT 3352 119 7 words word NNS 3352 119 8 in in IN 3352 119 9 the the DT 3352 119 10 captions caption NNS 3352 119 11 were be VBD 3352 119 12 used use VBN 3352 119 13 to to TO 3352 119 14 create create VB 3352 119 15 an an DT 3352 119 16 EVI evi NN 3352 119 17 to to IN 3352 119 18 the the DT 3352 119 19 sub- sub- JJ 3352 119 20 topics topic NNS 3352 119 21 in in IN 3352 119 22 the the DT 3352 119 23 topic topic JJ 3352 119 24 classification classification NN 3352 119 25 using use VBG 3352 119 26 the the DT 3352 119 27 method method NN 3352 119 28 already already RB 3352 119 29 described describe VBN 3352 119 30 . . . 3352 120 1 As as IN 3352 120 2 an an DT 3352 120 3 example example NN 3352 120 4 , , , 3352 120 5 the the DT 3352 120 6 query query NN 3352 120 7 “ " `` 3352 120 8 personal personal JJ 3352 120 9 individual individual JJ 3352 120 10 income income NN 3352 120 11 tax tax NN 3352 120 12 , , , 3352 120 13 ” " '' 3352 120 14 when when WRB 3352 120 15 submitted submit VBN 3352 120 16 to to IN 3352 120 17 the the DT 3352 120 18 EVI EVI NNP 3352 120 19 , , , 3352 120 20 generated generate VBD 3352 120 21 the the DT 3352 120 22 following follow VBG 3352 120 23 ranked rank VBN 3352 120 24 list list NN 3352 120 25 of of IN 3352 120 26 subtopics subtopic NNS 3352 120 27 : : : 3352 120 28 1 1 CD 3352 120 29 . . . 3352 121 1 Income income NN 3352 121 2 2 2 CD 3352 121 3 . . . 3352 122 1 Government government NN 3352 122 2 earnings earning NNS 3352 122 3 and and CC 3352 122 4 tax tax NN 3352 122 5 revenues revenue NNS 3352 122 6 3 3 CD 3352 122 7 . . . 3352 123 1 Personal personal JJ 3352 123 2 income income NN 3352 123 3 4 4 CD 3352 123 4 . . . 3352 124 1 Property property NN 3352 124 2 tax tax NN 3352 124 3 5 5 CD 3352 124 4 . . . 
3352 125 1 Personal personal JJ 3352 125 2 income income NN 3352 125 3 tax tax NN 3352 125 4 6 6 CD 3352 125 5 . . . 3352 126 1 Corporate corporate JJ 3352 126 2 income income NN 3352 126 3 tax tax NN 3352 126 4 7 7 CD 3352 126 5 . . . 3352 127 1 Per per IN 3352 127 2 capita capita NN 3352 127 3 income income NN 3352 127 4 A a DT 3352 127 5 user user NN 3352 127 6 can can MD 3352 127 7 click click VB 3352 127 8 on on IN 3352 127 9 any any DT 3352 127 10 selected select VBN 3352 127 11 subtopic subtopic NN 3352 127 12 to to TO 3352 127 13 retrieve retrieve VB 3352 127 14 the the DT 3352 127 15 cap- cap- JJ 3352 127 16 tions tion NNS 3352 127 17 of of IN 3352 127 18 tables table NNS 3352 127 19 assigned assign VBN 3352 127 20 that that DT 3352 127 21 subtopic subtopic NN 3352 127 22 . . . 3352 128 1 For for IN 3352 128 2 example example NN 3352 128 3 , , , 3352 128 4 clicking click VBG 3352 128 5 on on IN 3352 128 6 the the DT 3352 128 7 fifth fifth JJ 3352 128 8 subtopic subtopic NN 3352 128 9 , , , 3352 128 10 Personal personal JJ 3352 128 11 income income NN 3352 128 12 tax tax NN 3352 128 13 , , , 3352 128 14 retrieves retrieve VBZ 3352 128 15 : : : 3352 128 16 ■ ■ NFP 3352 128 17 Personal personal JJ 3352 128 18 income income NN 3352 128 19 tax tax NN 3352 128 20 returns return NNS 3352 128 21 : : : 3352 128 22 Number number NN 3352 128 23 and and CC 3352 128 24 amount amount NN 3352 128 25 of of IN 3352 128 26 adjusted adjust VBN 3352 128 27 gross gross JJ 3352 128 28 income income NN 3352 128 29 reported report VBN 3352 128 30 by by IN 3352 128 31 adjusted adjust VBN 3352 128 32 gross gross JJ 3352 128 33 income income NN 3352 128 34 class class NN 3352 128 35 California California NNP 3352 128 36 , , , 3352 128 37 1998 1998 CD 3352 128 38 taxable taxable JJ 3352 128 39 year year NN 3352 128 40 . . . 
3352 129 1 Table table NN 3352 129 2 D10YR00 d10yr00 NN 3352 129 3 ■ ■ NFP 3352 129 4 Personal personal JJ 3352 129 5 income income NN 3352 129 6 tax tax NN 3352 129 7 returns return NNS 3352 129 8 : : : 3352 129 9 Number number NN 3352 129 10 and and CC 3352 129 11 amount amount NN 3352 129 12 of of IN 3352 129 13 adjusted adjust VBN 3352 129 14 gross gross JJ 3352 129 15 income income NN 3352 129 16 reported report VBN 3352 129 17 by by IN 3352 129 18 adjusted adjust VBN 3352 129 19 gross gross JJ 3352 129 20 income income NN 3352 129 21 class class NN 3352 129 22 California California NNP 3352 129 23 , , , 3352 129 24 1997 1997 CD 3352 129 25 taxable taxable JJ 3352 129 26 year year NN 3352 129 27 . . . 3352 130 1 Table table NN 3352 130 2 D9 d9 NN 3352 130 3 ■ ■ NFP 3352 130 4 Personal personal JJ 3352 130 5 income income NN 3352 130 6 statistics statistic NNS 3352 130 7 by by IN 3352 130 8 county county NN 3352 130 9 , , , 3352 130 10 California California NNP 3352 130 11 1997 1997 CD 3352 130 12 taxable taxable JJ 3352 130 13 year year NN 3352 130 14 . . . 3352 131 1 Table table NN 3352 131 2 D10 d10 NN 3352 131 3 ■ ■ NFP 3352 131 4 Personal personal JJ 3352 131 5 income income NN 3352 131 6 statistics statistic NNS 3352 131 7 by by IN 3352 131 8 county county NN 3352 131 9 , , , 3352 131 10 California California NNP 3352 131 11 1998 1998 CD 3352 131 12 taxable taxable JJ 3352 131 13 year year NN 3352 131 14 . . . 
3352 132 1 Table table NN 3352 132 2 D11YR00 D11YR00 NNS 3352 132 3 ■ ■ NFP 3352 132 4 Transverse Transverse NNP 3352 132 5 searching search VBG 3352 132 6 between between IN 3352 132 7 text- text- NN 3352 132 8 and and CC 3352 132 9 numeric numeric JJ 3352 132 10 - - HYPH 3352 132 11 data datum NNS 3352 132 12 series series NN 3352 132 13 To to TO 3352 132 14 demonstrate demonstrate VB 3352 132 15 the the DT 3352 132 16 searching search VBG 3352 132 17 capability capability NN 3352 132 18 from from IN 3352 132 19 a a DT 3352 132 20 bib- bib- JJ 3352 132 21 liographic liographic JJ 3352 132 22 record record NN 3352 132 23 to to IN 3352 132 24 numeric numeric JJ 3352 132 25 - - HYPH 3352 132 26 data data NN 3352 132 27 sets set NNS 3352 132 28 , , , 3352 132 29 the the DT 3352 132 30 first first JJ 3352 132 31 step step NN 3352 132 32 is be VBZ 3352 132 33 to to TO 3352 132 34 retrieve retrieve VB 3352 132 35 and and CC 3352 132 36 display display VB 3352 132 37 a a DT 3352 132 38 bibliographic bibliographic JJ 3352 132 39 record record NN 3352 132 40 from from IN 3352 132 41 an an DT 3352 132 42 online online JJ 3352 132 43 catalog catalog NN 3352 132 44 . . . 3352 133 1 A a DT 3352 133 2 Web web NN 3352 133 3 - - HYPH 3352 133 4 based base VBN 3352 133 5 interface interface NN 3352 133 6 for for IN 3352 133 7 searching search VBG 3352 133 8 online online JJ 3352 133 9 catalogs catalog NNS 3352 133 10 was be VBD 3352 133 11 implemented implement VBN 3352 133 12 using use VBG 3352 133 13 an an DT 3352 133 14 in in IN 3352 133 15 - - HYPH 3352 133 16 house house NN 3352 133 17 implementation implementation NN 3352 133 18 of of IN 3352 133 19 the the DT 3352 133 20 Z39.50 Z39.50 NNP 3352 133 21 protocol protocol NNP 3352 133 22 . . . 
3352 134 1 Besides besides IN 3352 134 2 the the DT 3352 134 3 Z39.50 Z39.50 NNP 3352 134 4 protocol protocol NN 3352 134 5 , , , 3352 134 6 an an DT 3352 134 7 important important JJ 3352 134 8 component component NN 3352 134 9 that that WDT 3352 134 10 makes make VBZ 3352 134 11 searching search VBG 3352 134 12 remote remote JJ 3352 134 13 online online JJ 3352 134 14 catalogs catalog NNS 3352 134 15 feasible feasible JJ 3352 134 16 is be VBZ 3352 134 17 the the DT 3352 134 18 gateway gateway NN 3352 134 19 between between IN 3352 134 20 the the DT 3352 134 21 HTTP http JJ 3352 134 22 ( ( -LRB- 3352 134 23 Hypertext Hypertext NNP 3352 134 24 Transfer Transfer NNP 3352 134 25 Protocol Protocol NNP 3352 134 26 ) ) -RRB- 3352 134 27 and and CC 3352 134 28 the the DT 3352 134 29 Z39.50 Z39.50 NNP 3352 134 30 protocol protocol NNP 3352 134 31 . . . 3352 135 1 While while IN 3352 135 2 HTTP HTTP NNP 3352 135 3 is be VBZ 3352 135 4 a a DT 3352 135 5 connectionless connectionless RB 3352 135 6 - - HYPH 3352 135 7 oriented orient VBN 3352 135 8 protocol protocol NN 3352 135 9 , , , 3352 135 10 the the DT 3352 135 11 Z39.50 z39.50 NN 3352 135 12 is be VBZ 3352 135 13 a a DT 3352 135 14 connec- connec- NN 3352 135 15 tion tion NN 3352 135 16 - - HYPH 3352 135 17 oriented orient VBN 3352 135 18 protocol protocol NN 3352 135 19 . . . 3352 136 1 The the DT 3352 136 2 gateway gateway NN 3352 136 3 maintains maintain VBZ 3352 136 4 connections connection NNS 3352 136 5 to to TO 3352 136 6 remote remote JJ 3352 136 7 Z39.50 z39.50 NN 3352 136 8 servers server NNS 3352 136 9 . . . 3352 137 1 All all DT 3352 137 2 search search NN 3352 137 3 requests request NNS 3352 137 4 to to IN 3352 137 5 any any DT 3352 137 6 remote remote JJ 3352 137 7 Z39.50 z39.50 NN 3352 137 8 server server NN 3352 137 9 go go VB 3352 137 10 through through IN 3352 137 11 the the DT 3352 137 12 gateway gateway NN 3352 137 13 . . . 
3352 138 1 Searching search VBG 3352 138 2 from from IN 3352 138 3 catalog catalog NN 3352 138 4 records record NNS 3352 138 5 to to IN 3352 138 6 numeric numeric JJ 3352 138 7 data datum NNS 3352 138 8 sets set NNS 3352 138 9 Having have VBG 3352 138 10 selected select VBN 3352 138 11 some some DT 3352 138 12 text text NN 3352 138 13 ( ( -LRB- 3352 138 14 for for IN 3352 138 15 the the DT 3352 138 16 purposes purpose NNS 3352 138 17 of of IN 3352 138 18 this this DT 3352 138 19 study study NN 3352 138 20 , , , 3352 138 21 a a DT 3352 138 22 catalog catalog NN 3352 138 23 record record NN 3352 138 24 ) ) -RRB- 3352 138 25 , , , 3352 138 26 how how WRB 3352 138 27 could could MD 3352 138 28 one one CD 3352 138 29 identify identify VB 3352 138 30 the the DT 3352 138 31 facts fact NNS 3352 138 32 or or CC 3352 138 33 statis- statis- NN 3352 138 34 tics tic NNS 3352 138 35 in in IN 3352 138 36 a a DT 3352 138 37 numeric numeric JJ 3352 138 38 database database NN 3352 138 39 that that WDT 3352 138 40 are be VBP 3352 138 41 most most RBS 3352 138 42 closely closely RB 3352 138 43 related related JJ 3352 138 44 to to IN 3352 138 45 the the DT 3352 138 46 topic topic NN 3352 138 47 ? ? . 3352 139 1 Clicking click VBG 3352 139 2 on on IN 3352 139 3 a a DT 3352 139 4 “ " `` 3352 139 5 formulate formulate JJ 3352 139 6 query query NN 3352 139 7 ” " '' 3352 139 8 button button NN 3352 139 9 placed place VBN 3352 139 10 at at IN 3352 139 11 the the DT 3352 139 12 end end NN 3352 139 13 of of IN 3352 139 14 a a DT 3352 139 15 displayed display VBN 3352 139 16 full full JJ 3352 139 17 MARC MARC NNP 3352 139 18 record record NN 3352 139 19 creates create VBZ 3352 139 20 a a DT 3352 139 21 query query NN 3352 139 22 for for IN 3352 139 23 searching search VBG 3352 139 24 a a DT 3352 139 25 numeric numeric JJ 3352 139 26 database database NN 3352 139 27 . . . 
3352 140 1 The the DT 3352 140 2 initial initial JJ 3352 140 3 query query NN 3352 140 4 will will MD 3352 140 5 contain contain VB 3352 140 6 the the DT 3352 140 7 words word NNS 3352 140 8 extracted extract VBN 3352 140 9 from from IN 3352 140 10 the the DT 3352 140 11 title title NN 3352 140 12 , , , 3352 140 13 subtitle subtitle NN 3352 140 14 , , , 3352 140 15 and and CC 3352 140 16 the the DT 3352 140 17 subject subject JJ 3352 140 18 headings heading NNS 3352 140 19 and and CC 3352 140 20 is be VBZ 3352 140 21 placed place VBN 3352 140 22 in in IN 3352 140 23 a a DT 3352 140 24 new new JJ 3352 140 25 window window NN 3352 140 26 where where WRB 3352 140 27 the the DT 3352 140 28 user user NN 3352 140 29 can can MD 3352 140 30 modify modify VB 3352 140 31 or or CC 3352 140 32 expand expand VB 3352 140 33 the the DT 3352 140 34 query query NN 3352 140 35 before before IN 3352 140 36 submitting submit VBG 3352 140 37 it -PRON- PRP 3352 140 38 to to IN 3352 140 39 the the DT 3352 140 40 search search NN 3352 140 41 engine engine NN 3352 140 42 for for IN 3352 140 43 a a DT 3352 140 44 numeric numeric JJ 3352 140 45 database database NN 3352 140 46 . . . 3352 141 1 So so RB 3352 141 2 , , , 3352 141 3 for for IN 3352 141 4 example example NN 3352 141 5 , , , 3352 141 6 the the DT 3352 141 7 following follow VBG 3352 141 8 text text NN 3352 141 9 extracted extract VBD 3352 141 10 from from IN 3352 141 11 a a DT 3352 141 12 catalog catalog NN 3352 141 13 record record NN 3352 141 14 : : : 3352 141 15 Library library JJ 3352 141 16 laws law NNS 3352 141 17 of of IN 3352 141 18 the the DT 3352 141 19 State State NNP 3352 141 20 of of IN 3352 141 21 California California NNP 3352 141 22 , , , 3352 141 23 Library Library NNP 3352 141 24 legislation legislation NN 3352 141 25 . . . 3352 142 1 California California NNP 3352 142 2 . . . 
3352 143 1 Public public JJ 3352 143 2 libraries library NNS 3352 143 3 when when WRB 3352 143 4 submitted submit VBN 3352 143 5 as as IN 3352 143 6 a a DT 3352 143 7 query query NN 3352 143 8 , , , 3352 143 9 retrieves retrieve VBZ 3352 143 10 a a DT 3352 143 11 ranked rank VBN 3352 143 12 list list NN 3352 143 13 of of IN 3352 143 14 table table NN 3352 143 15 names name NNS 3352 143 16 , , , 3352 143 17 of of IN 3352 143 18 which which WDT 3352 143 19 two two CD 3352 143 20 , , , 3352 143 21 covering cover VBG 3352 143 22 different different JJ 3352 143 23 time time NN 3352 143 24 periods period NNS 3352 143 25 , , , 3352 143 26 are be VBP 3352 143 27 entitled entitle VBN 3352 143 28 Library Library NNP 3352 143 29 Statistics Statistics NNPS 3352 143 30 , , , 3352 143 31 Statewide Statewide NNP 3352 143 32 Summary Summary NNP 3352 143 33 by by IN 3352 143 34 Type Type NNP 3352 143 35 of of IN 3352 143 36 Library Library NNP 3352 143 37 , , , 3352 143 38 California California NNP 3352 143 39 . . . 
3352 144 1 Searching search VBG 3352 144 2 from from IN 3352 144 3 numeric numeric JJ 3352 144 4 data data NN 3352 144 5 sets set NNS 3352 144 6 to to IN 3352 144 7 catalog catalog NN 3352 144 8 records record NNS 3352 144 9 Transverse Transverse NNP 3352 144 10 search search NN 3352 144 11 in in IN 3352 144 12 the the DT 3352 144 13 other other JJ 3352 144 14 direction direction NN 3352 144 15 , , , 3352 144 16 starting start VBG 3352 144 17 from from IN 3352 144 18 a a DT 3352 144 19 data data NN 3352 144 20 table table NN 3352 144 21 , , , 3352 144 22 is be VBZ 3352 144 23 achieved achieve VBN 3352 144 24 by by IN 3352 144 25 forwarding forward VBG 3352 144 26 the the DT 3352 144 27 caption caption NN 3352 144 28 of of IN 3352 144 29 a a DT 3352 144 30 table table NN 3352 144 31 to to IN 3352 144 32 the the DT 3352 144 33 word word NN 3352 144 34 - - HYPH 3352 144 35 to to IN 3352 144 36 - - HYPH 3352 144 37 LCSH LCSH NNP 3352 144 38 EVI EVI NNP 3352 144 39 to to TO 3352 144 40 generate generate VB 3352 144 41 a a DT 3352 144 42 prompt prompt JJ 3352 144 43 list list NN 3352 144 44 of of IN 3352 144 45 the the DT 3352 144 46 seven seven CD 3352 144 47 top top RB 3352 144 48 - - HYPH 3352 144 49 ranked rank VBN 3352 144 50 LCSHs LCSHs NNPS 3352 144 51 , , , 3352 144 52 any any DT 3352 144 53 one one CD 3352 144 54 of of IN 3352 144 55 which which WDT 3352 144 56 can can MD 3352 144 57 be be VB 3352 144 58 used use VBN 3352 144 59 as as IN 3352 144 60 a a DT 3352 144 61 query query NN 3352 144 62 submitted submit VBN 3352 144 63 to to IN 3352 144 64 the the DT 3352 144 65 catalog catalog NN 3352 144 66 . . . 3352 145 1 ■ ■ NFP 3352 145 2 Architecture Architecture NNP 3352 145 3 Figure Figure NNP 3352 145 4 4 4 CD 3352 145 5 shows show VBZ 3352 145 6 the the DT 3352 145 7 structure structure NN 3352 145 8 of of IN 3352 145 9 the the DT 3352 145 10 implementation implementation NN 3352 145 11 . . . 
3352 146 1 The the DT 3352 146 2 boxes box NNS 3352 146 3 shown show VBN 3352 146 4 in in IN 3352 146 5 the the DT 3352 146 6 figure figure NN 3352 146 7 are be VBP 3352 146 8 : : : 3352 146 9 1 1 LS 3352 146 10 . . . 3352 147 1 A a DT 3352 147 2 search search NN 3352 147 3 interface interface NN 3352 147 4 for for IN 3352 147 5 accessing access VBG 3352 147 6 bibliographic bibliographic NN 3352 147 7 / / SYM 3352 147 8 tex- tex- NN 3352 147 9 tual tual JJ 3352 147 10 resources resource NNS 3352 147 11 through through IN 3352 147 12 a a DT 3352 147 13 word word NN 3352 147 14 - - HYPH 3352 147 15 to to IN 3352 147 16 - - HYPH 3352 147 17 LCSH LCSH NNP 3352 147 18 EVI EVI NNP 3352 147 19 . . . 3352 148 1 2 2 LS 3352 148 2 . . . 3352 149 1 A a DT 3352 149 2 word word NN 3352 149 3 to to IN 3352 149 4 the the DT 3352 149 5 LCSH LCSH NNP 3352 149 6 EVI EVI NNP 3352 149 7 . . . 3352 150 1 3 3 LS 3352 150 2 . . . 3352 151 1 A a DT 3352 151 2 ranked ranked JJ 3352 151 3 list list NN 3352 151 4 of of IN 3352 151 5 LCSHs LCSHs NNPS 3352 151 6 closely closely RB 3352 151 7 associated associate VBN 3352 151 8 with with IN 3352 151 9 the the DT 3352 151 10 query query NN 3352 151 11 . . . 3352 152 1 4 4 LS 3352 152 2 . . . 3352 153 1 An an DT 3352 153 2 online online JJ 3352 153 3 catalog catalog NN 3352 153 4 . . . 3352 154 1 186 186 CD 3352 154 2 INFORMATION information NN 3352 154 3 TECHNOLOGY technology NN 3352 154 4 AND and CC 3352 154 5 LIBRARIES library NNS 3352 154 6 | | CD 3352 154 7 DECEMBER DECEMBER NNP 3352 154 8 2006 2006 CD 3352 154 9 5 5 CD 3352 154 10 . . . 3352 155 1 Results result NNS 3352 155 2 of of IN 3352 155 3 searching search VBG 3352 155 4 the the DT 3352 155 5 online online JJ 3352 155 6 catalog catalog NN 3352 155 7 using use VBG 3352 155 8 an an DT 3352 155 9 LCSH LCSH NNP 3352 155 10 . . . 3352 156 1 6 6 CD 3352 156 2 . . . 
3352 157 1 A a DT 3352 157 2 full full JJ 3352 157 3 MARC MARC NNP 3352 157 4 record record NN 3352 157 5 displayed display VBN 3352 157 6 in in IN 3352 157 7 tagged tag VBN 3352 157 8 form form NN 3352 157 9 . . . 3352 158 1 7 7 LS 3352 158 2 . . . 3352 159 1 A a DT 3352 159 2 new new JJ 3352 159 3 query query NN 3352 159 4 formed form VBN 3352 159 5 by by IN 3352 159 6 extracting extract VBG 3352 159 7 the the DT 3352 159 8 title title NN 3352 159 9 and and CC 3352 159 10 sub- sub- JJ 3352 159 11 ject ject NN 3352 159 12 fields field NNS 3352 159 13 from from IN 3352 159 14 the the DT 3352 159 15 displayed display VBN 3352 159 16 full full JJ 3352 159 17 MARC MARC NNP 3352 159 18 record record NN 3352 159 19 . . . 3352 160 1 8 8 LS 3352 160 2 . . . 3352 161 1 A a DT 3352 161 2 numeric numeric JJ 3352 161 3 database database NN 3352 161 4 . . . 3352 162 1 9 9 CD 3352 162 2 . . . 3352 163 1 A a DT 3352 163 2 list list NN 3352 163 3 of of IN 3352 163 4 captions caption NNS 3352 163 5 of of IN 3352 163 6 numeric numeric JJ 3352 163 7 tables table NNS 3352 163 8 ranked rank VBN 3352 163 9 by by IN 3352 163 10 rel- rel- NNP 3352 163 11 evance evance NN 3352 163 12 score score NN 3352 163 13 to to IN 3352 163 14 the the DT 3352 163 15 query query NN 3352 163 16 . . . 3352 164 1 1 1 CD 3352 164 2 0 0 CD 3352 164 3 . . . 3352 165 1 Numeric numeric JJ 3352 165 2 table table NN 3352 165 3 displayed display VBN 3352 165 4 in in IN 3352 165 5 PDF PDF NNP 3352 165 6 or or CC 3352 165 7 MS MS NNP 3352 165 8 Excel Excel NNP 3352 165 9 for- for- XX 3352 165 10 mat mat NN 3352 165 11 . . . 3352 166 1 11 11 CD 3352 166 2 . . . 3352 167 1 A a DT 3352 167 2 search search NN 3352 167 3 interface interface NN 3352 167 4 for for IN 3352 167 5 numeric numeric JJ 3352 167 6 databases database NNS 3352 167 7 based base VBN 3352 167 8 on on IN 3352 167 9 a a DT 3352 167 10 probabilistic probabilistic JJ 3352 167 11 search search NN 3352 167 12 algorithm algorithm NN 3352 167 13 . . . 
3352 168 1 A a DT 3352 168 2 user user NN 3352 168 3 can can MD 3352 168 4 start start VB 3352 168 5 a a DT 3352 168 6 search search NN 3352 168 7 using use VBG 3352 168 8 either either CC 3352 168 9 interface interface NN 3352 168 10 ( ( -LRB- 3352 168 11 boxes box NNS 3352 168 12 1 1 CD 3352 168 13 or or CC 3352 168 14 11 11 CD 3352 168 15 ) ) -RRB- 3352 168 16 and and CC 3352 168 17 , , , 3352 168 18 from from IN 3352 168 19 either either CC 3352 168 20 starting starting NN 3352 168 21 point point NN 3352 168 22 , , , 3352 168 23 find find VB 3352 168 24 records record NNS 3352 168 25 on on IN 3352 168 26 the the DT 3352 168 27 same same JJ 3352 168 28 topic topic NN 3352 168 29 of of IN 3352 168 30 interest interest NN 3352 168 31 in in IN 3352 168 32 a a DT 3352 168 33 textual textual NN 3352 168 34 ( ( -LRB- 3352 168 35 here here RB 3352 168 36 bibliographic bibliographic JJ 3352 168 37 ) ) -RRB- 3352 168 38 database database NN 3352 168 39 and and CC 3352 168 40 a a DT 3352 168 41 socioeconomic socioeconomic JJ 3352 168 42 database database NN 3352 168 43 . . . 3352 169 1 ■ ■ NFP 3352 169 2 Conclusions conclusion NNS 3352 169 3 and and CC 3352 169 4 further further JJ 3352 169 5 work work NN 3352 169 6 Enhanced enhance VBN 3352 169 7 access access NN 3352 169 8 to to IN 3352 169 9 numeric numeric JJ 3352 169 10 data datum NNS 3352 169 11 sets set NNS 3352 169 12 The the DT 3352 169 13 descriptive descriptive JJ 3352 169 14 texts text NNS 3352 169 15 associated associate VBN 3352 169 16 with with IN 3352 169 17 numeric numeric JJ 3352 169 18 tables table NNS 3352 169 19 , , , 3352 169 20 such such JJ 3352 169 21 as as IN 3352 169 22 the the DT 3352 169 23 caption caption NN 3352 169 24 , , , 3352 169 25 headers header NNS 3352 169 26 , , , 3352 169 27 or or CC 3352 169 28 row row NN 3352 169 29 labels label NNS 3352 169 30 , , , 3352 169 31 are be VBP 3352 169 32 usually usually RB 3352 169 33 very very RB 3352 169 34 short short JJ 3352 169 35 . . . 
3352 170 1 They -PRON- PRP 3352 170 2 provide provide VBP 3352 170 3 a a DT 3352 170 4 rather rather RB 3352 170 5 limited limited JJ 3352 170 6 basis basis NN 3352 170 7 for for IN 3352 170 8 locating locate VBG 3352 170 9 the the DT 3352 170 10 table table NN 3352 170 11 in in IN 3352 170 12 response response NN 3352 170 13 to to IN 3352 170 14 queries query NNS 3352 170 15 , , , 3352 170 16 or or CC 3352 170 17 describing describe VBG 3352 170 18 a a DT 3352 170 19 data data NN 3352 170 20 cell cell NN 3352 170 21 sufficiently sufficiently RB 3352 170 22 to to TO 3352 170 23 form form VB 3352 170 24 a a DT 3352 170 25 usefully usefully RB 3352 170 26 descriptive descriptive JJ 3352 170 27 query query NN 3352 170 28 from from IN 3352 170 29 it -PRON- PRP 3352 170 30 . . . 3352 171 1 Sometimes sometimes RB 3352 171 2 the the DT 3352 171 3 title title NN 3352 171 4 ( ( -LRB- 3352 171 5 caption caption NN 3352 171 6 ) ) -RRB- 3352 171 7 of of IN 3352 171 8 a a DT 3352 171 9 table table NN 3352 171 10 may may MD 3352 171 11 be be VB 3352 171 12 the the DT 3352 171 13 only only JJ 3352 171 14 searchable searchable JJ 3352 171 15 textual textual JJ 3352 171 16 description description NN 3352 171 17 about about IN 3352 171 18 the the DT 3352 171 19 content content NN 3352 171 20 of of IN 3352 171 21 the the DT 3352 171 22 table table NN 3352 171 23 , , , 3352 171 24 and and CC 3352 171 25 the the DT 3352 171 26 titles title NNS 3352 171 27 are be VBP 3352 171 28 sometimes sometimes RB 3352 171 29 very very RB 3352 171 30 general general JJ 3352 171 31 . . . 
3352 172 1 For for IN 3352 172 2 example example NN 3352 172 3 , , , 3352 172 4 one one CD 3352 172 5 of of IN 3352 172 6 the the DT 3352 172 7 titles title NNS 3352 172 8 , , , 3352 172 9 Library Library NNP 3352 172 10 Statistics Statistics NNPS 3352 172 11 , , , 3352 172 12 Statewide Statewide NNP 3352 172 13 Summary Summary NNP 3352 172 14 by by IN 3352 172 15 Type Type NNP 3352 172 16 of of IN 3352 172 17 Library Library NNP 3352 172 18 California California NNP 3352 172 19 , , , 3352 172 20 1992–93 1992–93 CD 3352 172 21 to to IN 3352 172 22 1997–98 1997–98 CD 3352 172 23 , , , 3352 172 24 is be VBZ 3352 172 25 so so RB 3352 172 26 general general JJ 3352 172 27 that that IN 3352 172 28 neither neither CC 3352 172 29 the the DT 3352 172 30 kinds kind NNS 3352 172 31 of of IN 3352 172 32 statistics statistic NNS 3352 172 33 nor nor CC 3352 172 34 the the DT 3352 172 35 types type NNS 3352 172 36 of of IN 3352 172 37 libraries library NNS 3352 172 38 are be VBP 3352 172 39 revealed reveal VBN 3352 172 40 . . . 3352 173 1 If if IN 3352 173 2 a a DT 3352 173 3 user user NN 3352 173 4 posed pose VBD 3352 173 5 the the DT 3352 173 6 question question NN 3352 173 7 , , , 3352 173 8 “ " `` 3352 173 9 What what WP 3352 173 10 are be VBP 3352 173 11 the the DT 3352 173 12 total total JJ 3352 173 13 operating operate VBG 3352 173 14 expenditures expenditure NNS 3352 173 15 of of IN 3352 173 16 public public JJ 3352 173 17 libraries library NNS 3352 173 18 in in IN 3352 173 19 California California NNP 3352 173 20 ? ? . 
3352 173 21 ” " '' 3352 173 22 to to IN 3352 173 23 a a DT 3352 173 24 query query NN 3352 173 25 system system NN 3352 173 26 that that WDT 3352 173 27 indexes index VBZ 3352 173 28 table table VBP 3352 173 29 titles title NNS 3352 173 30 only only RB 3352 173 31 , , , 3352 173 32 the the DT 3352 173 33 search search NN 3352 173 34 may may MD 3352 173 35 well well RB 3352 173 36 be be VB 3352 173 37 ineffective ineffective JJ 3352 173 38 since since IN 3352 173 39 the the DT 3352 173 40 only only JJ 3352 173 41 word word NN 3352 173 42 in in IN 3352 173 43 common common JJ 3352 173 44 between between IN 3352 173 45 the the DT 3352 173 46 table table NN 3352 173 47 title title NN 3352 173 48 and and CC 3352 173 49 the the DT 3352 173 50 user user NN 3352 173 51 ’s ’s POS 3352 173 52 query query NN 3352 173 53 is be VBZ 3352 173 54 “ " `` 3352 173 55 California California NNP 3352 173 56 ” " '' 3352 173 57 and and CC 3352 173 58 , , , 3352 173 59 if if IN 3352 173 60 the the DT 3352 173 61 plurals plural NNS 3352 173 62 of of IN 3352 173 63 nouns noun NNS 3352 173 64 have have VBP 3352 173 65 been be VBN 3352 173 66 normalized normalize VBN 3352 173 67 , , , 3352 173 68 to to IN 3352 173 69 the the DT 3352 173 70 singular singular JJ 3352 173 71 form form NN 3352 173 72 , , , 3352 173 73 “ " `` 3352 173 74 library library NN 3352 173 75 . . . 3352 173 76 ” " '' 3352 173 77 Table table NN 3352 173 78 column column NN 3352 173 79 headings heading NNS 3352 173 80 and and CC 3352 173 81 row row VB 3352 173 82 headings heading NNS 3352 173 83 provide provide VBP 3352 173 84 additional additional JJ 3352 173 85 information information NN 3352 173 86 about about IN 3352 173 87 the the DT 3352 173 88 content content NN 3352 173 89 of of IN 3352 173 90 a a DT 3352 173 91 numeric numeric JJ 3352 173 92 table table NN 3352 173 93 . . . 
3352 174 1 However however RB 3352 174 2 , , , 3352 174 3 the the DT 3352 174 4 column column NN 3352 174 5 and and CC 3352 174 6 row row NN 3352 174 7 headings heading NNS 3352 174 8 are be VBP 3352 174 9 usu- usu- RB 3352 174 10 ally ally NN 3352 174 11 not not RB 3352 174 12 directly directly RB 3352 174 13 searchable searchable JJ 3352 174 14 . . . 3352 175 1 For for IN 3352 175 2 example example NN 3352 175 3 , , , 3352 175 4 a a DT 3352 175 5 table table NN 3352 175 6 named name VBN 3352 175 7 “ " `` 3352 175 8 Language language NN 3352 175 9 spoken speak VBN 3352 175 10 at at IN 3352 175 11 home home NN 3352 175 12 ” " '' 3352 175 13 in in IN 3352 175 14 Counting Counting NNP 3352 175 15 California California NNP 3352 175 16 databases database VBZ 3352 175 17 consists consist VBZ 3352 175 18 of of IN 3352 175 19 rows row NNS 3352 175 20 and and CC 3352 175 21 columns column NNS 3352 175 22 . . . 3352 176 1 The the DT 3352 176 2 column column NN 3352 176 3 headings heading NNS 3352 176 4 list list VBP 3352 176 5 the the DT 3352 176 6 languages language NNS 3352 176 7 spoken speak VBN 3352 176 8 at at IN 3352 176 9 home home NN 3352 176 10 , , , 3352 176 11 while while IN 3352 176 12 the the DT 3352 176 13 row row NN 3352 176 14 headings heading NNS 3352 176 15 show show VBP 3352 176 16 the the DT 3352 176 17 county county NN 3352 176 18 names name NNS 3352 176 19 in in IN 3352 176 20 California California NNP 3352 176 21 . . . 
3352 177 1 Each each DT 3352 177 2 cell cell NN 3352 177 3 in in IN 3352 177 4 the the DT 3352 177 5 table table NN 3352 177 6 gives give VBZ 3352 177 7 the the DT 3352 177 8 number number NN 3352 177 9 of of IN 3352 177 10 people people NNS 3352 177 11 , , , 3352 177 12 five five CD 3352 177 13 years year NNS 3352 177 14 of of IN 3352 177 15 age age NN 3352 177 16 and and CC 3352 177 17 older old JJR 3352 177 18 , , , 3352 177 19 who who WP 3352 177 20 speak speak VBP 3352 177 21 a a DT 3352 177 22 specific specific JJ 3352 177 23 language language NN 3352 177 24 at at IN 3352 177 25 home home NN 3352 177 26 . . . 3352 178 1 To to TO 3352 178 2 answer answer VB 3352 178 3 questions question NNS 3352 178 4 such such JJ 3352 178 5 as as IN 3352 178 6 “ " `` 3352 178 7 How how WRB 3352 178 8 many many JJ 3352 178 9 people people NNS 3352 178 10 speak speak VBP 3352 178 11 Spanish Spanish NNP 3352 178 12 at at IN 3352 178 13 home home NN 3352 178 14 in in IN 3352 178 15 Alameda Alameda NNP 3352 178 16 County County NNP 3352 178 17 , , , 3352 178 18 California California NNP 3352 178 19 ? ? . 3352 178 20 ” " '' 3352 178 21 using use VBG 3352 178 22 the the DT 3352 178 23 table table NN 3352 178 24 title title NN 3352 178 25 alone alone RB 3352 178 26 may may MD 3352 178 27 not not RB 3352 178 28 retrieve retrieve VB 3352 178 29 the the DT 3352 178 30 table table NN 3352 178 31 that that WDT 3352 178 32 contains contain VBZ 3352 178 33 the the DT 3352 178 34 answer answer NN 3352 178 35 to to IN 3352 178 36 the the DT 3352 178 37 example example NN 3352 178 38 question question NN 3352 178 39 . . . 3352 179 1 It -PRON- PRP 3352 179 2 is be VBZ 3352 179 3 recommended recommend VBN 3352 179 4 that that IN 3352 179 5 the the DT 3352 179 6 textual textual JJ 3352 179 7 descriptions description NNS 3352 179 8 of of IN 3352 179 9 numeric numeric JJ 3352 179 10 tables table NNS 3352 179 11 be be VB 3352 179 12 enriched enrich VBN 3352 179 13 . . . 
3352 180 1 Automatically automatically RB 3352 180 2 combining combine VBG 3352 180 3 the the DT 3352 180 4 table table NN 3352 180 5 title title NN 3352 180 6 and and CC 3352 180 7 its -PRON- PRP$ 3352 180 8 column column NN 3352 180 9 and and CC 3352 180 10 row row NN 3352 180 11 headings heading NNS 3352 180 12 would would MD 3352 180 13 be be VB 3352 180 14 a a DT 3352 180 15 small small JJ 3352 180 16 but but CC 3352 180 17 practical practical JJ 3352 180 18 step step NN 3352 180 19 toward toward IN 3352 180 20 improved improve VBN 3352 180 21 retrieval retrieval NN 3352 180 22 . . . 3352 181 1 Geographic geographic JJ 3352 181 2 search search NN 3352 181 3 Socioeconomic socioeconomic JJ 3352 181 4 numeric numeric NNP 3352 181 5 data data NNP 3352 181 6 series series NN 3352 181 7 refer refer NN 3352 181 8 to to IN 3352 181 9 particular particular JJ 3352 181 10 areas area NNS 3352 181 11 and and CC 3352 181 12 , , , 3352 181 13 in in IN 3352 181 14 contrast contrast NN 3352 181 15 to to IN 3352 181 16 text text NN 3352 181 17 searching searching NN 3352 181 18 , , , 3352 181 19 the the DT 3352 181 20 geographical geographical JJ 3352 181 21 aspect aspect NN 3352 181 22 ordinarily ordinarily RB 3352 181 23 has have VBZ 3352 181 24 to to TO 3352 181 25 be be VB 3352 181 26 specified specify VBN 3352 181 27 . . . 3352 182 1 To to TO 3352 182 2 match match VB 3352 182 3 the the DT 3352 182 4 geographical geographical JJ 3352 182 5 area area NN 3352 182 6 of of IN 3352 182 7 the the DT 3352 182 8 numeric numeric JJ 3352 182 9 data datum NNS 3352 182 10 , , , 3352 182 11 a a DT 3352 182 12 matching match VBG 3352 182 13 text text NN 3352 182 14 search search NN 3352 182 15 may may MD 3352 182 16 also also RB 3352 182 17 have have VB 3352 182 18 to to TO 3352 182 19 specify specify VB 3352 182 20 the the DT 3352 182 21 same same JJ 3352 182 22 place place NN 3352 182 23 . . . 
3352 183 1 The the DT 3352 183 2 authors author NNS 3352 183 3 found find VBD 3352 183 4 that that IN 3352 183 5 this this DT 3352 183 6 was be VBD 3352 183 7 hard hard JJ 3352 183 8 to to TO 3352 183 9 achieve achieve VB 3352 183 10 for for IN 3352 183 11 several several JJ 3352 183 12 reasons reason NNS 3352 183 13 . . . 3352 184 1 Place place NN 3352 184 2 names name NNS 3352 184 3 are be VBP 3352 184 4 ambiguous ambiguous JJ 3352 184 5 and and CC 3352 184 6 unstable unstable JJ 3352 184 7 : : : 3352 184 8 A a DT 3352 184 9 search search NN 3352 184 10 for for IN 3352 184 11 data datum NNS 3352 184 12 relating relate VBG 3352 184 13 to to IN 3352 184 14 Trinidad Trinidad NNP 3352 184 15 might may MD 3352 184 16 lead lead VB 3352 184 17 to to IN 3352 184 18 Trinidad Trinidad NNP 3352 184 19 , , , 3352 184 20 West West NNP 3352 184 21 Indies Indies NNP 3352 184 22 , , , 3352 184 23 instead instead RB 3352 184 24 of of IN 3352 184 25 Trinidad Trinidad NNP 3352 184 26 , , , 3352 184 27 California California NNP 3352 184 28 , , , 3352 184 29 for for IN 3352 184 30 example example NN 3352 184 31 . . . 3352 185 1 The the DT 3352 185 2 problem problem NN 3352 185 3 is be VBZ 3352 185 4 compounded compound VBN 3352 185 5 because because IN 3352 185 6 , , , 3352 185 7 in in IN 3352 185 8 numeric numeric NNP 3352 185 9 data data NNP 3352 185 10 series series NNP 3352 185 11 , , , 3352 185 12 specialized specialized JJ 3352 185 13 geopolitical geopolitical JJ 3352 185 14 divisions division NNS 3352 185 15 , , , 3352 185 16 such such JJ 3352 185 17 as as IN 3352 185 18 census census NN 3352 185 19 tracts tract NNS 3352 185 20 and and CC 3352 185 21 counties county NNS 3352 185 22 , , , 3352 185 23 are be VBP 3352 185 24 commonly commonly RB 3352 185 25 used use VBN 3352 185 26 . . . 
3352 186 1 These these DT 3352 186 2 divisions division NNS 3352 186 3 do do VBP 3352 186 4 not not RB 3352 186 5 match match VB 3352 186 6 conve- conve- VB 3352 186 7 niently niently RB 3352 186 8 with with IN 3352 186 9 searchers searcher NNS 3352 186 10 ’ ’ POS 3352 186 11 ordinary ordinary JJ 3352 186 12 use use NN 3352 186 13 of of IN 3352 186 14 place place NN 3352 186 15 names name NNS 3352 186 16 . . . 3352 187 1 Also also RB 3352 187 2 , , , 3352 187 3 the the DT 3352 187 4 granularity granularity NN 3352 187 5 of of IN 3352 187 6 geographical geographical JJ 3352 187 7 coverage coverage NN 3352 187 8 may may MD 3352 187 9 not not RB 3352 187 10 match match VB 3352 187 11 well well RB 3352 187 12 . . . 3352 188 1 Data datum NNS 3352 188 2 relating relate VBG 3352 188 3 to to IN 3352 188 4 Berkeley Berkeley NNP 3352 188 5 , , , 3352 188 6 for for IN 3352 188 7 example example NN 3352 188 8 , , , 3352 188 9 may may MD 3352 188 10 be be VB 3352 188 11 avail- avail- PRP$ 3352 188 12 able able JJ 3352 188 13 only only RB 3352 188 14 in in IN 3352 188 15 aggregated aggregate VBN 3352 188 16 data datum NNS 3352 188 17 for for IN 3352 188 18 Alameda Alameda NNP 3352 188 19 County County NNP 3352 188 20 . . . 3352 189 1 It -PRON- PRP 3352 189 2 was be VBD 3352 189 3 eventually eventually RB 3352 189 4 concluded conclude VBN 3352 189 5 that that IN 3352 189 6 reliance reliance NN 3352 189 7 on on IN 3352 189 8 the the DT 3352 189 9 names name NNS 3352 189 10 of of IN 3352 189 11 places place NNS 3352 189 12 could could MD 3352 189 13 never never RB 3352 189 14 work work VB 3352 189 15 satisfactorily satisfactorily RB 3352 189 16 . . . 
3352 190 1 The the DT 3352 190 2 only only JJ 3352 190 3 effective effective JJ 3352 190 4 path path NN 3352 190 5 to to IN 3352 190 6 reliable reliable JJ 3352 190 7 access access NN 3352 190 8 to to IN 3352 190 9 data datum NNS 3352 190 10 relating relate VBG 3352 190 11 to to IN 3352 190 12 places place NNS 3352 190 13 would would MD 3352 190 14 be be VB 3352 190 15 to to TO 3352 190 16 use use VB 3352 190 17 geospatial geospatial JJ 3352 190 18 coordinates coordinate NNS 3352 190 19 ( ( -LRB- 3352 190 20 latitude latitude NN 3352 190 21 and and CC 3352 190 22 longitude longitude NN 3352 190 23 ) ) -RRB- 3352 190 24 to to TO 3352 190 25 establish establish VB 3352 190 26 unambiguously unambiguously RB 3352 190 27 the the DT 3352 190 28 identity identity NN 3352 190 29 and and CC 3352 190 30 location location NN 3352 190 31 of of IN 3352 190 32 any any DT 3352 190 33 place place NN 3352 190 34 and and CC 3352 190 35 the the DT 3352 190 36 relationship relationship NN 3352 190 37 between between IN 3352 190 38 places place NNS 3352 190 39 . . . 3352 191 1 This this DT 3352 191 2 means mean VBZ 3352 191 3 that that IN 3352 191 4 gazetteers gazetteer NNS 3352 191 5 and and CC 3352 191 6 map map VB 3352 191 7 visualizations visualization NNS 3352 191 8 become become VBP 3352 191 9 important important JJ 3352 191 10 . . . 3352 192 1 Gazetteers gazetteer NNS 3352 192 2 relate relate VBP 3352 192 3 named name VBN 3352 192 4 places place NNS 3352 192 5 to to IN 3352 192 6 defined define VBN 3352 192 7 spaces space NNS 3352 192 8 , , , 3352 192 9 and and CC 3352 192 10 thereby thereby RB 3352 192 11 reveal reveal VBP 3352 192 12 spatial spatial JJ 3352 192 13 relationships relationship NNS 3352 192 14 between between IN 3352 192 15 places place NNS 3352 192 16 , , , 3352 192 17 e.g. e.g. 
RB 3352 192 18 , , , 3352 192 19 the the DT 3352 192 20 city city NN 3352 192 21 of of IN 3352 192 22 Alameda Alameda NNP 3352 192 23 is be VBZ 3352 192 24 on on IN 3352 192 25 Alameda Alameda NNP 3352 192 26 Island Island NNP 3352 192 27 within within IN 3352 192 28 Alameda Alameda NNP 3352 192 29 County County NNP 3352 192 30 . . . 3352 193 1 This this DT 3352 193 2 problem problem NN 3352 193 3 has have VBZ 3352 193 4 been be VBN 3352 193 5 addressed address VBN 3352 193 6 in in IN 3352 193 7 a a DT 3352 193 8 subsequent subsequent JJ 3352 193 9 Figure figure NN 3352 193 10 4 4 CD 3352 193 11 . . . 3352 194 1 Architecture architecture NN 3352 194 2 of of IN 3352 194 3 the the DT 3352 194 4 prototype prototype NN 3352 194 5 SEARCH SEARCH NNS 3352 194 6 ACROSS acros VBN 3352 194 7 DIFFERENT DIFFERENT NNP 3352 194 8 MEDIA medium NNS 3352 194 9 | | CD 3352 194 10 BUCKLAND BUCKLAND NNP 3352 194 11 , , , 3352 194 12 CHEN CHEN NNP 3352 194 13 , , , 3352 194 14 GEY GEY NNP 3352 194 15 , , , 3352 194 16 AND and CC 3352 194 17 LARSON LARSON NNP 3352 194 18 187 187 CD 3352 194 19 study study NN 3352 194 20 entitled entitle VBN 3352 194 21 “ " `` 3352 194 22 Going Going NNP 3352 194 23 Places Places NNPS 3352 194 24 in in IN 3352 194 25 the the DT 3352 194 26 Catalog catalog NN 3352 194 27 : : : 3352 194 28 Improved Improved NNP 3352 194 29 Geographical Geographical NNP 3352 194 30 Access Access NNP 3352 194 31 . . . 
3352 194 32 ”8 ”8 ADD 3352 194 33 Temporal temporal JJ 3352 194 34 search search NN 3352 194 35 Searches search NNS 3352 194 36 of of IN 3352 194 37 text text NN 3352 194 38 files file NNS 3352 194 39 and and CC 3352 194 40 of of IN 3352 194 41 socioeconomic socioeconomic JJ 3352 194 42 numeric numeric NNP 3352 194 43 data data NNP 3352 194 44 series series NNP 3352 194 45 also also RB 3352 194 46 differ differ VBP 3352 194 47 substantially substantially RB 3352 194 48 with with IN 3352 194 49 respect respect NN 3352 194 50 to to IN 3352 194 51 time time NN 3352 194 52 periods period NNS 3352 194 53 : : : 3352 194 54 Numeric numeric JJ 3352 194 55 data datum NNS 3352 194 56 searches search NNS 3352 194 57 ordinarily ordinarily RB 3352 194 58 require require VBP 3352 194 59 the the DT 3352 194 60 years year NNS 3352 194 61 of of IN 3352 194 62 inter- inter- NN 3352 194 63 est est NN 3352 194 64 to to TO 3352 194 65 be be VB 3352 194 66 specified specify VBN 3352 194 67 ; ; : 3352 194 68 text text NN 3352 194 69 searches search NNS 3352 194 70 rarely rarely RB 3352 194 71 specify specify VBP 3352 194 72 the the DT 3352 194 73 period period NN 3352 194 74 . . . 
3352 195 1 An an DT 3352 195 2 additional additional JJ 3352 195 3 difficulty difficulty NN 3352 195 4 arises arise VBZ 3352 195 5 because because IN 3352 195 6 in in IN 3352 195 7 text text NN 3352 195 8 , , , 3352 195 9 as as IN 3352 195 10 in in IN 3352 195 11 speech speech NN 3352 195 12 , , , 3352 195 13 a a DT 3352 195 14 period period NN 3352 195 15 is be VBZ 3352 195 16 commonly commonly RB 3352 195 17 referred refer VBN 3352 195 18 to to IN 3352 195 19 by by IN 3352 195 20 a a DT 3352 195 21 name name NN 3352 195 22 derived derive VBN 3352 195 23 meta- meta- JJ 3352 195 24 phorically phorically RB 3352 195 25 from from IN 3352 195 26 events event NNS 3352 195 27 used use VBN 3352 195 28 as as IN 3352 195 29 temporal temporal JJ 3352 195 30 markers marker NNS 3352 195 31 , , , 3352 195 32 rather rather RB 3352 195 33 than than IN 3352 195 34 by by IN 3352 195 35 calendar calendar NN 3352 195 36 time time NN 3352 195 37 , , , 3352 195 38 as as IN 3352 195 39 in in IN 3352 195 40 “ " `` 3352 195 41 during during IN 3352 195 42 Vietnam Vietnam NNP 3352 195 43 , , , 3352 195 44 ” " '' 3352 195 45 “ " `` 3352 195 46 under under IN 3352 195 47 Clinton Clinton NNP 3352 195 48 , , , 3352 195 49 ” " '' 3352 195 50 or or CC 3352 195 51 “ " `` 3352 195 52 in in IN 3352 195 53 the the DT 3352 195 54 reign reign NN 3352 195 55 of of IN 3352 195 56 Henry Henry NNP 3352 195 57 VIII VIII NNP 3352 195 58 . . . 
3352 195 59 ” " '' 3352 195 60 Named name VBN 3352 195 61 time time NN 3352 195 62 periods period NNS 3352 195 63 have have VBP 3352 195 64 some some DT 3352 195 65 of of IN 3352 195 66 the the DT 3352 195 67 characteristics characteristic NNS 3352 195 68 of of IN 3352 195 69 place place NN 3352 195 70 names name NNS 3352 195 71 : : : 3352 195 72 they -PRON- PRP 3352 195 73 are be VBP 3352 195 74 culturally culturally RB 3352 195 75 based base VBN 3352 195 76 and and CC 3352 195 77 tend tend VB 3352 195 78 to to TO 3352 195 79 be be VB 3352 195 80 multiple multiple JJ 3352 195 81 , , , 3352 195 82 unstable unstable JJ 3352 195 83 , , , 3352 195 84 and and CC 3352 195 85 ambiguous ambiguous JJ 3352 195 86 . . . 3352 196 1 It -PRON- PRP 3352 196 2 appears appear VBZ 3352 196 3 that that IN 3352 196 4 an an DT 3352 196 5 analogous analogous JJ 3352 196 6 solution solution NN 3352 196 7 is be VBZ 3352 196 8 indicated indicate VBN 3352 196 9 : : : 3352 196 10 directories directory NNS 3352 196 11 of of IN 3352 196 12 named name VBN 3352 196 13 time time NN 3352 196 14 periods period NNS 3352 196 15 mapped map VBN 3352 196 16 to to TO 3352 196 17 calendar calendar VB 3352 196 18 definitions definition NNS 3352 196 19 , , , 3352 196 20 much much RB 3352 196 21 as as IN 3352 196 22 a a DT 3352 196 23 gazet- gazet- NN 3352 196 24 teer teer NN 3352 196 25 links link NNS 3352 196 26 place place NN 3352 196 27 names name NNS 3352 196 28 to to IN 3352 196 29 spatial spatial JJ 3352 196 30 locators locator NNS 3352 196 31 . . . 
3352 197 1 This this DT 3352 197 2 problem problem NN 3352 197 3 is be VBZ 3352 197 4 being be VBG 3352 197 5 addressed address VBN 3352 197 6 in in IN 3352 197 7 a a DT 3352 197 8 subsequent subsequent JJ 3352 197 9 study study NN 3352 197 10 entitled entitle VBN 3352 197 11 “ " `` 3352 197 12 Support support NN 3352 197 13 for for IN 3352 197 14 the the DT 3352 197 15 Learner Learner NNP 3352 197 16 : : : 3352 197 17 What what WP 3352 197 18 , , , 3352 197 19 Where where WRB 3352 197 20 , , , 3352 197 21 When when WRB 3352 197 22 , , , 3352 197 23 and and CC 3352 197 24 Who who WP 3352 197 25 . . . 3352 197 26 ”9 ”9 -LRB- 3352 197 27 Media Media NNP 3352 197 28 forms form VBZ 3352 197 29 The the DT 3352 197 30 paradox paradox NN 3352 197 31 , , , 3352 197 32 in in IN 3352 197 33 an an DT 3352 197 34 environment environment NN 3352 197 35 of of IN 3352 197 36 digital digital NNP 3352 197 37 “ " `` 3352 197 38 media medium NNS 3352 197 39 conver- conver- NNP 3352 197 40 gence gence NNP 3352 197 41 , , , 3352 197 42 ” " '' 3352 197 43 that that IN 3352 197 44 it -PRON- PRP 3352 197 45 appears appear VBZ 3352 197 46 impossible impossible JJ 3352 197 47 to to TO 3352 197 48 search search VB 3352 197 49 directly directly RB 3352 197 50 across across IN 3352 197 51 different different JJ 3352 197 52 media medium NNS 3352 197 53 forms form NNS 3352 197 54 invites invite VBZ 3352 197 55 closer close JJR 3352 197 56 attention attention NN 3352 197 57 to to IN 3352 197 58 concepts concept NNS 3352 197 59 and and CC 3352 197 60 terminology terminology NN 3352 197 61 associated associate VBN 3352 197 62 with with IN 3352 197 63 media medium NNS 3352 197 64 . . . 
3352 198 1 A a DT 3352 198 2 view view NN 3352 198 3 that that IN 3352 198 4 fits fit VBZ 3352 198 5 and and CC 3352 198 6 explains explain VBZ 3352 198 7 the the DT 3352 198 8 phenomena phenomenon NNS 3352 198 9 as as IN 3352 198 10 the the DT 3352 198 11 authors author NNS 3352 198 12 understand understand VBP 3352 198 13 them -PRON- PRP 3352 198 14 , , , 3352 198 15 distinguishes distinguish VBZ 3352 198 16 three three CD 3352 198 17 aspects aspect NNS 3352 198 18 of of IN 3352 198 19 media medium NNS 3352 198 20 : : : 3352 198 21 ■ ■ NFP 3352 198 22 Cultural cultural JJ 3352 198 23 codes code NNS 3352 198 24 : : : 3352 198 25 All all DT 3352 198 26 forms form NNS 3352 198 27 of of IN 3352 198 28 expression expression NN 3352 198 29 depend depend VBP 3352 198 30 on on IN 3352 198 31 some some DT 3352 198 32 shared share VBN 3352 198 33 understandings understanding NNS 3352 198 34 , , , 3352 198 35 on on IN 3352 198 36 language language NN 3352 198 37 in in IN 3352 198 38 a a DT 3352 198 39 broad broad JJ 3352 198 40 sense sense NN 3352 198 41 . . . 3352 199 1 Convergence Convergence NNP 3352 199 2 here here RB 3352 199 3 means mean VBZ 3352 199 4 cultural cultural JJ 3352 199 5 convergence convergence NN 3352 199 6 or or CC 3352 199 7 interpretation interpretation NN 3352 199 8 . . . 3352 200 1 ■ ■ NFP 3352 200 2 Media medium NNS 3352 200 3 types type NNS 3352 200 4 : : : 3352 200 5 Different different JJ 3352 200 6 types type NNS 3352 200 7 of of IN 3352 200 8 expression expression NN 3352 200 9 have have VBP 3352 200 10 evolved evolve VBN 3352 200 11 : : : 3352 200 12 Texts text NNS 3352 200 13 , , , 3352 200 14 images image NNS 3352 200 15 , , , 3352 200 16 numbers number NNS 3352 200 17 , , , 3352 200 18 diagrams diagram NNS 3352 200 19 , , , 3352 200 20 art art NN 3352 200 21 . . . 
3352 201 1 An an DT 3352 201 2 initial initial JJ 3352 201 3 classification classification NN 3352 201 4 can can MD 3352 201 5 well well RB 3352 201 6 start start VB 3352 201 7 with with IN 3352 201 8 the the DT 3352 201 9 five five CD 3352 201 10 senses sense NNS 3352 201 11 of of IN 3352 201 12 sight sight NN 3352 201 13 , , , 3352 201 14 smell smell NN 3352 201 15 , , , 3352 201 16 hearing hearing NN 3352 201 17 , , , 3352 201 18 taste taste NN 3352 201 19 , , , 3352 201 20 and and CC 3352 201 21 feel feel VBP 3352 201 22 . . . 3352 202 1 ■ ■ NFP 3352 202 2 Physical physical JJ 3352 202 3 media medium NNS 3352 202 4 : : : 3352 202 5 Paper paper NN 3352 202 6 ; ; : 3352 202 7 film film NN 3352 202 8 ; ; , 3352 202 9 analog analog JJ 3352 202 10 magnetic magnetic JJ 3352 202 11 tape tape NN 3352 202 12 ; ; : 3352 202 13 bits bit NNS 3352 202 14 ; ; : 3352 202 15 . . . 3352 203 1 . . . 3352 204 1 . . . 3352 205 1 Being be VBG 3352 205 2 digital digital JJ 3352 205 3 affects affect NNS 3352 205 4 directly directly RB 3352 205 5 only only RB 3352 205 6 this this DT 3352 205 7 aspect aspect NN 3352 205 8 . . . 
3352 206 1 Anything anything NN 3352 206 2 perceived perceive VBN 3352 206 3 as as IN 3352 206 4 a a DT 3352 206 5 meaningful meaningful JJ 3352 206 6 document document NN 3352 206 7 has have VBZ 3352 206 8 cul- cul- RB 3352 206 9 tural tural JJ 3352 206 10 , , , 3352 206 11 type type NN 3352 206 12 , , , 3352 206 13 and and CC 3352 206 14 physical physical JJ 3352 206 15 aspects aspect NNS 3352 206 16 , , , 3352 206 17 and and CC 3352 206 18 genre genre VBN 3352 206 19 usefully usefully RB 3352 206 20 denotes denote VBZ 3352 206 21 specific specific JJ 3352 206 22 combinations combination NNS 3352 206 23 of of IN 3352 206 24 code code NN 3352 206 25 , , , 3352 206 26 type type NN 3352 206 27 , , , 3352 206 28 and and CC 3352 206 29 physical physical JJ 3352 206 30 medium medium NN 3352 206 31 adopted adopt VBN 3352 206 32 by by IN 3352 206 33 social social JJ 3352 206 34 convention convention NN 3352 206 35 . . . 3352 207 1 Genres genre NNS 3352 207 2 are be VBP 3352 207 3 historically historically RB 3352 207 4 and and CC 3352 207 5 culturally culturally RB 3352 207 6 situated situate VBN 3352 207 7 . . . 3352 208 1 Convergence convergence NN 3352 208 2 can can MD 3352 208 3 be be VB 3352 208 4 understood understand VBN 3352 208 5 in in IN 3352 208 6 terms term NNS 3352 208 7 of of IN 3352 208 8 interoper- interoper- NN 3352 208 9 ability ability NN 3352 208 10 and and CC 3352 208 11 is be VBZ 3352 208 12 clearly clearly RB 3352 208 13 seen see VBN 3352 208 14 in in IN 3352 208 15 physical physical JJ 3352 208 16 media medium NNS 3352 208 17 technology technology NN 3352 208 18 . . . 
3352 209 1 The the DT 3352 209 2 adoption adoption NN 3352 209 3 of of IN 3352 209 4 English English NNP 3352 209 5 as as IN 3352 209 6 a a DT 3352 209 7 language language NN 3352 209 8 for for IN 3352 209 9 international international JJ 3352 209 10 use use NN 3352 209 11 in in IN 3352 209 12 an an DT 3352 209 13 increasingly increasingly RB 3352 209 14 global global JJ 3352 209 15 community community NN 3352 209 16 promotes promote VBZ 3352 209 17 conver- conver- NNP 3352 209 18 gence gence NNP 3352 209 19 in in IN 3352 209 20 cultural cultural JJ 3352 209 21 codes code NNS 3352 209 22 . . . 3352 210 1 Nevertheless nevertheless RB 3352 210 2 , , , 3352 210 3 the the DT 3352 210 4 different different JJ 3352 210 5 media medium NNS 3352 210 6 types type NNS 3352 210 7 are be VBP 3352 210 8 fundamentally fundamentally RB 3352 210 9 distinct distinct JJ 3352 210 10 . . . 3352 211 1 Metadata Metadata NNP 3352 211 2 as as IN 3352 211 3 infrastructure infrastructure NN 3352 211 4 It -PRON- PRP 3352 211 5 is be VBZ 3352 211 6 the the DT 3352 211 7 metadata metadata NN 3352 211 8 and and CC 3352 211 9 , , , 3352 211 10 in in IN 3352 211 11 a a DT 3352 211 12 very very RB 3352 211 13 broad broad JJ 3352 211 14 sense sense NN 3352 211 15 , , , 3352 211 16 “ " `` 3352 211 17 biblio- biblio- NNP 3352 211 18 graphic graphic NNP 3352 211 19 ” " '' 3352 211 20 tools tool NNS 3352 211 21 that that WDT 3352 211 22 provide provide VBP 3352 211 23 the the DT 3352 211 24 infrastructure infrastructure NN 3352 211 25 necessary necessary JJ 3352 211 26 for for IN 3352 211 27 searches search NNS 3352 211 28 across across RB 3352 211 29 and and CC 3352 211 30 between between IN 3352 211 31 different different JJ 3352 211 32 media medium NNS 3352 211 33 — — : 3352 211 34 thesauruses thesaurus NNS 3352 211 35 , , , 3352 211 36 mappings mapping NNS 3352 211 37 between between IN 3352 211 38 vocabularies vocabulary NNS 3352 211 39 , , , 3352 211 40 place place NN 3352 211 41 - - HYPH 3352 211 
42 name name NN 3352 211 43 gazetteers gazetteer NNS 3352 211 44 , , , 3352 211 45 and and CC 3352 211 46 the the DT 3352 211 47 like like JJ 3352 211 48 . . . 3352 212 1 In in IN 3352 212 2 isolation isolation NN 3352 212 3 , , , 3352 212 4 metadata metadata NN 3352 212 5 is be VBZ 3352 212 6 properly properly RB 3352 212 7 regarded regard VBN 3352 212 8 as as IN 3352 212 9 description description NN 3352 212 10 attached attach VBN 3352 212 11 to to IN 3352 212 12 documents document NNS 3352 212 13 , , , 3352 212 14 but but CC 3352 212 15 this this DT 3352 212 16 is be VBZ 3352 212 17 too too RB 3352 212 18 narrow narrow JJ 3352 212 19 a a DT 3352 212 20 view view NN 3352 212 21 . . . 3352 213 1 Collectively collectively RB 3352 213 2 , , , 3352 213 3 the the DT 3352 213 4 metadata metadata NN 3352 213 5 forms form VBZ 3352 213 6 the the DT 3352 213 7 infrastructure infrastructure NN 3352 213 8 through through IN 3352 213 9 which which WDT 3352 213 10 different different JJ 3352 213 11 documents document NNS 3352 213 12 can can MD 3352 213 13 be be VB 3352 213 14 related relate VBN 3352 213 15 to to IN 3352 213 16 each each DT 3352 213 17 other other JJ 3352 213 18 . . . 
3352 214 1 It -PRON- PRP 3352 214 2 is be VBZ 3352 214 3 a a DT 3352 214 4 variation variation NN 3352 214 5 on on IN 3352 214 6 the the DT 3352 214 7 role role NN 3352 214 8 of of IN 3352 214 9 citations citation NNS 3352 214 10 : : : 3352 214 11 Individually individually RB 3352 214 12 , , , 3352 214 13 references reference NNS 3352 214 14 amplify amplify VBP 3352 214 15 an an DT 3352 214 16 individual individual JJ 3352 214 17 document document NN 3352 214 18 by by IN 3352 214 19 validating validate VBG 3352 214 20 statements statement NNS 3352 214 21 made make VBN 3352 214 22 within within IN 3352 214 23 it -PRON- PRP 3352 214 24 ; ; : 3352 214 25 collectively collectively RB 3352 214 26 , , , 3352 214 27 as as IN 3352 214 28 a a DT 3352 214 29 citation citation NN 3352 214 30 index index NN 3352 214 31 , , , 3352 214 32 references reference NNS 3352 214 33 show show VBP 3352 214 34 the the DT 3352 214 35 structure structure NN 3352 214 36 of of IN 3352 214 37 scholarship scholarship NN 3352 214 38 to to TO 3352 214 39 which which WDT 3352 214 40 docu- docu- NN 3352 214 41 ments ment NNS 3352 214 42 are be VBP 3352 214 43 attached attach VBN 3352 214 44 . . . 
3352 215 1 ■ ■ NFP 3352 215 2 Summary Summary NNP 3352 215 3 A a DT 3352 215 4 project project NN 3352 215 5 was be VBD 3352 215 6 undertaken undertake VBN 3352 215 7 to to TO 3352 215 8 demonstrate demonstrate VB 3352 215 9 simultane- simultane- JJ 3352 215 10 ous ous JJ 3352 215 11 search search NN 3352 215 12 of of IN 3352 215 13 two two CD 3352 215 14 different different JJ 3352 215 15 media medium NNS 3352 215 16 types type NNS 3352 215 17 ( ( -LRB- 3352 215 18 socioeconomic socioeconomic JJ 3352 215 19 numeric numeric NNP 3352 215 20 data data NNP 3352 215 21 series series NN 3352 215 22 and and CC 3352 215 23 text text NN 3352 215 24 files file NNS 3352 215 25 ) ) -RRB- 3352 215 26 without without IN 3352 215 27 ingesting ingest VBG 3352 215 28 these these DT 3352 215 29 diverse diverse JJ 3352 215 30 resources resource NNS 3352 215 31 into into IN 3352 215 32 a a DT 3352 215 33 shared shared JJ 3352 215 34 environment environment NN 3352 215 35 . . . 3352 216 1 The the DT 3352 216 2 project project NN 3352 216 3 objective objective NN 3352 216 4 was be VBD 3352 216 5 eventually eventually RB 3352 216 6 achieved achieve VBN 3352 216 7 , , , 3352 216 8 but but CC 3352 216 9 proved prove VBD 3352 216 10 harder hard RBR 3352 216 11 than than IN 3352 216 12 expected expect VBN 3352 216 13 for for IN 3352 216 14 the the DT 3352 216 15 following follow VBG 3352 216 16 reasons reason NNS 3352 216 17 : : : 3352 216 18 Access access NN 3352 216 19 to to IN 3352 216 20 these these DT 3352 216 21 differ- differ- NN 3352 216 22 ent ent NN 3352 216 23 media medium NNS 3352 216 24 types type NNS 3352 216 25 has have VBZ 3352 216 26 been be VBN 3352 216 27 developed develop VBN 3352 216 28 by by IN 3352 216 29 different different JJ 3352 216 30 commu- commu- FW 3352 216 31 nities nitie NNS 3352 216 32 with with IN 3352 216 33 different different JJ 3352 216 34 practices practice NNS 3352 216 35 ; ; : 3352 216 36 the the DT 3352 216 37 systems system NNS 3352 216 38 ( ( 
-LRB- 3352 216 39 vocabularies vocabulary NNS 3352 216 40 ) ) -RRB- 3352 216 41 for for IN 3352 216 42 topical topical JJ 3352 216 43 categorization categorization NN 3352 216 44 vary vary VBP 3352 216 45 greatly greatly RB 3352 216 46 and and CC 3352 216 47 need need VBP 3352 216 48 interpre- interpre- NN 3352 216 49 tative tative NN 3352 216 50 mappings mapping NNS 3352 216 51 ( ( -LRB- 3352 216 52 also also RB 3352 216 53 known know VBN 3352 216 54 as as IN 3352 216 55 relative relative JJ 3352 216 56 indexes index NNS 3352 216 57 , , , 3352 216 58 search- search- JJ 3352 216 59 term term NN 3352 216 60 recommender recommender NN 3352 216 61 systems system NNS 3352 216 62 , , , 3352 216 63 and and CC 3352 216 64 EVIs evi NNS 3352 216 65 ) ) -RRB- 3352 216 66 ; ; : 3352 216 67 specification specification NN 3352 216 68 of of IN 3352 216 69 geographical geographical JJ 3352 216 70 area area NN 3352 216 71 and and CC 3352 216 72 time time NN 3352 216 73 period period NN 3352 216 74 are be VBP 3352 216 75 as as RB 3352 216 76 necessary necessary JJ 3352 216 77 for for IN 3352 216 78 search search NN 3352 216 79 in in IN 3352 216 80 socioeconomic socioeconomic NNP 3352 216 81 data data NNP 3352 216 82 series series NNP 3352 216 83 and and CC 3352 216 84 , , , 3352 216 85 for for IN 3352 216 86 this this DT 3352 216 87 , , , 3352 216 88 existing exist VBG 3352 216 89 procedures procedure NNS 3352 216 90 for for IN 3352 216 91 searching search VBG 3352 216 92 text text NN 3352 216 93 files file NNS 3352 216 94 are be VBP 3352 216 95 inadequate inadequate JJ 3352 216 96 . . . 
3352 217 1 ■ ■ NFP 3352 217 2 Acknowledgement Acknowledgement NNP 3352 217 3 This this DT 3352 217 4 work work NN 3352 217 5 was be VBD 3352 217 6 partially partially RB 3352 217 7 supported support VBN 3352 217 8 by by IN 3352 217 9 the the DT 3352 217 10 Institute Institute NNP 3352 217 11 of of IN 3352 217 12 Museum Museum NNP 3352 217 13 and and CC 3352 217 14 Library Library NNP 3352 217 15 Services Services NNPS 3352 217 16 through through IN 3352 217 17 National National NNP 3352 217 18 Library Library NNP 3352 217 19 Leadership Leadership NNP 3352 217 20 Grant Grant NNP 3352 217 21 No No NNP 3352 217 22 . . . 3352 218 1 178 178 CD 3352 218 2 for for IN 3352 218 3 a a DT 3352 218 4 project project NN 3352 218 5 entitled entitle VBN 3352 218 6 “ " `` 3352 218 7 Seamless Seamless NNP 3352 218 8 Searching Searching NNP 3352 218 9 of of IN 3352 218 10 Numeric Numeric NNP 3352 218 11 and and CC 3352 218 12 Textual Textual NNP 3352 218 13 Resources Resources NNP 3352 218 14 , , , 3352 218 15 ” " '' 3352 218 16 and and CC 3352 218 17 was be VBD 3352 218 18 based base VBN 3352 218 19 on on IN 3352 218 20 prior prior JJ 3352 218 21 research research NN 3352 218 22 partially partially RB 3352 218 23 supported support VBN 3352 218 24 by by IN 3352 218 25 DARPA DARPA NNP 3352 218 26 Contracts Contracts NNP 3352 218 27 N66001 N66001 NNP 3352 218 28 - - HYPH 3352 218 29 97-C-8541 97-C-8541 NNP 3352 218 30 ; ; : 3352 218 31 AO AO NNP 3352 218 32 # # NNP 3352 218 33 F477 F477 NNP 3352 218 34 : : : 3352 218 35 “ " `` 3352 218 36 Search Search NNP 3352 218 37 Support support NN 3352 218 38 for for IN 3352 218 39 Unfamiliar Unfamiliar NNP 3352 218 40 Metadata Metadata NNP 3352 218 41 Vocabularies Vocabularies NNPS 3352 218 42 ” " '' 3352 218 43 and and CC 3352 218 44 N66001 N66001 NNP 3352 218 45 - - HYPH 3352 218 46 00 00 CD 3352 218 47 - - HYPH 3352 218 48 1- 1- CD 3352 218 49 8911 8911 CD 3352 218 50 , , , 3352 218 51 TO to IN 3352 218 52 # # NN 3352 218 53 J290 J290 NNP 
3352 218 54 : : : 3352 218 55 “ " `` 3352 218 56 Translingual Translingual NNP 3352 218 57 Information Information NNP 3352 218 58 Management Management NNP 3352 218 59 Using Using NNP 3352 218 60 Domain Domain NNP 3352 218 61 Ontologies Ontologies NNPS 3352 218 62 . . . 3352 218 63 ” " '' 3352 218 64 References reference NNS 3352 218 65 1 1 CD 3352 218 66 . . . 3352 219 1 Michael Michael NNP 3352 219 2 K. K. NNP 3352 219 3 Buckland Buckland NNP 3352 219 4 , , , 3352 219 5 Fredric Fredric NNP 3352 219 6 C. C. NNP 3352 219 7 Gey Gey NNP 3352 219 8 , , , 3352 219 9 and and CC 3352 219 10 Ray Ray NNP 3352 219 11 R. R. NNP 3352 219 12 Larson Larson NNP 3352 219 13 , , , 3352 219 14 Seamless Seamless NNP 3352 219 15 Searching Searching NNP 3352 219 16 of of IN 3352 219 17 Numeric Numeric NNP 3352 219 18 and and CC 3352 219 19 Textual Textual NNP 3352 219 20 Resources Resources NNPS 3352 219 21 : : : 3352 219 22 Final Final NNP 3352 219 23 Report Report NNP 3352 219 24 on on IN 3352 219 25 Institute Institute NNP 3352 219 26 of of IN 3352 219 27 Museum Museum NNP 3352 219 28 and and CC 3352 219 29 Library Library NNP 3352 219 30 Services Services NNPS 3352 219 31 National National NNP 3352 219 32 Leadership Leadership NNP 3352 219 33 188 188 CD 3352 219 34 INFORMATION INFORMATION NNP 3352 219 35 TECHNOLOGY technology NN 3352 219 36 AND and CC 3352 219 37 LIBRARIES LIBRARIES NNP 3352 219 38 | | CD 3352 219 39 DECEMBER DECEMBER NNP 3352 219 40 2006 2006 CD 3352 219 41 Grant Grant NNP 3352 219 42 No no NN 3352 219 43 . . . 3352 220 1 178 178 CD 3352 220 2 ( ( -LRB- 3352 220 3 Berkeley Berkeley NNP 3352 220 4 , , , 3352 220 5 Calif. California NNP 3352 220 6 : : : 3352 220 7 Univ Univ NNP 3352 220 8 . . . 
3352 221 1 of of IN 3352 221 2 California California NNP 3352 221 3 , , , 3352 221 4 School School NNP 3352 221 5 of of IN 3352 221 6 Information Information NNP 3352 221 7 Management Management NNP 3352 221 8 and and CC 3352 221 9 Systems Systems NNPS 3352 221 10 , , , 3352 221 11 2002 2002 CD 3352 221 12 ) ) -RRB- 3352 221 13 , , , 3352 221 14 http:// http:// NNP 3352 221 15 metadata.sims.berkeley.edu/papers/SeamlessSearchFinal metadata.sims.berkeley.edu/papers/SeamlessSearchFinal HYPH 3352 221 16 Report.pdf report.pdf NN 3352 221 17 ( ( -LRB- 3352 221 18 accessed access VBN 3352 221 19 July July NNP 3352 221 20 18 18 CD 3352 221 21 , , , 3352 221 22 2006 2006 CD 3352 221 23 ) ) -RRB- 3352 221 24 ; ; : 3352 221 25 Michael Michael NNP 3352 221 26 Buckland Buckland NNP 3352 221 27 et et NNP 3352 221 28 al al NNP 3352 221 29 . . NNP 3352 221 30 , , , 3352 221 31 “ " `` 3352 221 32 Seamless Seamless NNP 3352 221 33 Searching Searching NNP 3352 221 34 of of IN 3352 221 35 Numeric Numeric NNP 3352 221 36 and and CC 3352 221 37 Textual Textual NNP 3352 221 38 Resources Resources NNPS 3352 221 39 : : : 3352 221 40 Fri- Fri- NNP 3352 221 41 day day NN 3352 221 42 Afternoon Afternoon NNP 3352 221 43 Seminar Seminar NNP 3352 221 44 , , , 3352 221 45 Feb. February NNP 3352 221 46 14 14 CD 3352 221 47 , , , 3352 221 48 2003 2003 CD 3352 221 49 , , , 3352 221 50 ” " '' 3352 221 51 http://metadata.sims http://metadata.sims NNP 3352 221 52 .berkeley.edu .berkeley.edu . 3352 221 53 / / SYM 3352 221 54 papers paper NNS 3352 221 55 / / SYM 3352 221 56 seamlessfri.ppt seamlessfri.ppt NNS 3352 221 57 ( ( -LRB- 3352 221 58 accessed access VBN 3352 221 59 July July NNP 3352 221 60 18 18 CD 3352 221 61 , , , 3352 221 62 2006 2006 CD 3352 221 63 ) ) -RRB- 3352 221 64 . . . 3352 222 1 2 2 LS 3352 222 2 . . . 3352 223 1 Michael Michael NNP 3352 223 2 Buckland Buckland NNP 3352 223 3 et et NNP 3352 223 4 al al NNP 3352 223 5 . . 
NNP 3352 223 6 , , , 3352 223 7 “ " `` 3352 223 8 Mapping Mapping NNP 3352 223 9 Entry Entry NNP 3352 223 10 Vocabulary Vocabulary NNP 3352 223 11 to to IN 3352 223 12 Unfamiliar Unfamiliar NNP 3352 223 13 Metadata Metadata NNP 3352 223 14 Vocabularies Vocabularies NNPS 3352 223 15 , , , 3352 223 16 ” " '' 3352 223 17 D D NNP 3352 223 18 - - HYPH 3352 223 19 Lib Lib NNP 3352 223 20 Magazine Magazine NNP 3352 223 21 5 5 CD 3352 223 22 , , , 3352 223 23 no no UH 3352 223 24 . . . 3352 224 1 1 1 LS 3352 224 2 ( ( -LRB- 3352 224 3 Jan. January NNP 3352 224 4 1999 1999 CD 3352 224 5 ) ) -RRB- 3352 224 6 , , , 3352 224 7 www.dlib.org/dlib/january99/buckland/01buckland www.dlib.org/dlib/january99/buckland/01buckland NNP 3352 224 8 .html .html NNP 3352 224 9 ( ( -LRB- 3352 224 10 accessed access VBN 3352 224 11 July July NNP 3352 224 12 18 18 CD 3352 224 13 , , , 3352 224 14 2006 2006 CD 3352 224 15 ) ) -RRB- 3352 224 16 ; ; : 3352 224 17 Michael Michael NNP 3352 224 18 Buckland Buckland NNP 3352 224 19 , , , 3352 224 20 “ " `` 3352 224 21 The the DT 3352 224 22 Sig- Sig- NNP 3352 224 23 nificance nificance NN 3352 224 24 of of IN 3352 224 25 Vocabulary Vocabulary NNP 3352 224 26 , , , 3352 224 27 ” " '' 3352 224 28 2000 2000 CD 3352 224 29 , , , 3352 224 30 http://metadata.sims.berkeley http://metadata.sims.berkeley NNP 3352 224 31 .edu .edu . 3352 224 32 / / SYM 3352 224 33 vocabsig.ppt vocabsig.ppt NNS 3352 224 34 ( ( -LRB- 3352 224 35 accessed access VBN 3352 224 36 July July NNP 3352 224 37 18 18 CD 3352 224 38 , , , 3352 224 39 2006 2006 CD 3352 224 40 ) ) -RRB- 3352 224 41 ; ; : 3352 224 42 Fredric Fredric NNP 3352 224 43 C. C. NNP 3352 224 44 Gey Gey NNP 3352 224 45 et et NNP 3352 224 46 al al NNP 3352 224 47 . . 
NNP 3352 224 48 , , , 3352 224 49 “ " `` 3352 224 50 Entry Entry NNP 3352 224 51 Vocabulary Vocabulary NNP 3352 224 52 : : : 3352 224 53 A a DT 3352 224 54 Technology Technology NNP 3352 224 55 to to TO 3352 224 56 Enhance enhance VB 3352 224 57 Digital Digital NNP 3352 224 58 Search Search NNP 3352 224 59 , , , 3352 224 60 ” " '' 3352 224 61 in in IN 3352 224 62 Proceedings Proceedings NNP 3352 224 63 of of IN 3352 224 64 the the DT 3352 224 65 First First NNP 3352 224 66 International International NNP 3352 224 67 Conference Conference NNP 3352 224 68 on on IN 3352 224 69 Human Human NNP 3352 224 70 Lan- Lan- NNP 3352 224 71 guage guage NN 3352 224 72 Technology Technology NNP 3352 224 73 , , , 3352 224 74 San San NNP 3352 224 75 Diego Diego NNP 3352 224 76 , , , 3352 224 77 Mar. March NNP 3352 225 1 2001 2001 CD 3352 225 2 ( ( -LRB- 3352 225 3 San San NNP 3352 225 4 Francisco Francisco NNP 3352 225 5 : : : 3352 225 6 Morgan Morgan NNP 3352 225 7 Kaufmann Kaufmann NNP 3352 225 8 , , , 3352 225 9 2001 2001 CD 3352 225 10 ) ) -RRB- 3352 225 11 , , , 3352 225 12 91–95 91–95 NNP 3352 225 13 , , , 3352 225 14 http://metadata.sims.berkeley.edu/ http://metadata.sims.berkeley.edu/ NNP 3352 225 15 papers paper NNS 3352 225 16 / / SYM 3352 225 17 hlt01-final.pdf hlt01-final.pdf NNS 3352 225 18 ( ( -LRB- 3352 225 19 accessed access VBN 3352 225 20 July July NNP 3352 225 21 18 18 CD 3352 225 22 , , , 3352 225 23 2006 2006 CD 3352 225 24 ) ) -RRB- 3352 225 25 . . . 3352 226 1 3 3 LS 3352 226 2 . . . 3352 227 1 Los Los NNP 3352 227 2 Angeles Angeles NNP 3352 227 3 Times Times NNP 3352 227 4 , , , 3352 227 5 July July NNP 3352 227 6 12 12 CD 3352 227 7 , , , 3352 227 8 1995 1995 CD 3352 227 9 : : : 3352 227 10 D1 d1 NN 3352 227 11 . . . 3352 228 1 4 4 LS 3352 228 2 . . . 
3352 229 1 Michael Michael NNP 3352 229 2 Buckland Buckland NNP 3352 229 3 , , , 3352 229 4 “ " `` 3352 229 5 Vocabulary Vocabulary NNP 3352 229 6 As as IN 3352 229 7 a a DT 3352 229 8 Central Central NNP 3352 229 9 Concept Concept NNP 3352 229 10 in in IN 3352 229 11 Library Library NNP 3352 229 12 and and CC 3352 229 13 Information Information NNP 3352 229 14 Science Science NNP 3352 229 15 , , , 3352 229 16 ” " '' 3352 229 17 in in IN 3352 229 18 Digital Digital NNP 3352 229 19 Libraries Libraries NNPS 3352 229 20 : : : 3352 229 21 Interdisci- Interdisci- NNP 3352 229 22 plinary plinary JJ 3352 229 23 Concepts Concepts NNPS 3352 229 24 , , , 3352 229 25 Challenges Challenges NNPS 3352 229 26 , , , 3352 229 27 and and CC 3352 229 28 Opportunities opportunity NNS 3352 229 29 . . . 3352 230 1 Proceedings proceeding NNS 3352 230 2 of of IN 3352 230 3 the the DT 3352 230 4 Third Third NNP 3352 230 5 International International NNP 3352 230 6 Conference Conference NNP 3352 230 7 on on IN 3352 230 8 Conceptions Conceptions NNPS 3352 230 9 of of IN 3352 230 10 Library Library NNP 3352 230 11 and and CC 3352 230 12 Infor- Infor- NNP 3352 230 13 mation mation NN 3352 230 14 Science science NN 3352 230 15 ( ( -LRB- 3352 230 16 CoLIS3 CoLIS3 NNP 3352 230 17 ) ) -RRB- 3352 230 18 , , , 3352 230 19 Dubrovnik Dubrovnik NNP 3352 230 20 , , , 3352 230 21 Croatia Croatia NNP 3352 230 22 , , , 3352 230 23 May May NNP 3352 230 24 23–26 23–26 CD 3352 230 25 , , , 3352 230 26 1999 1999 CD 3352 230 27 , , , 3352 230 28 ed ed NN 3352 230 29 . . . 3352 231 1 T. T. NNP 3352 231 2 Arpanac Arpanac NNP 3352 231 3 et et FW 3352 231 4 al al NNP 3352 231 5 . . . 3352 232 1 ( ( -LRB- 3352 232 2 Lokve Lokve NNP 3352 232 3 , , , 3352 232 4 Croatia Croatia NNP 3352 232 5 : : : 3352 232 6 Benja Benja NNP 3352 232 7 Pubs Pubs NNPS 3352 232 8 . . . 
3352 232 9 , , , 3352 232 10 1999 1999 CD 3352 232 11 ) ) -RRB- 3352 232 12 , , , 3352 232 13 3–12 3–12 NNP 3352 232 14 , , , 3352 232 15 www www NNP 3352 232 16 .sims.berkeley.edu/~buckland .sims.berkeley.edu/~buckland NNP 3352 232 17 / / SYM 3352 232 18 colisvoc.htm colisvoc.htm . 3352 232 19 ( ( -LRB- 3352 232 20 accessed access VBN 3352 232 21 July July NNP 3352 232 22 18 18 CD 3352 232 23 , , , 3352 232 24 2006 2006 CD 3352 232 25 ) ) -RRB- 3352 232 26 ; ; : 3352 232 27 Buckland Buckland NNP 3352 232 28 et et NNP 3352 232 29 al al NNP 3352 232 30 . . NNP 3352 232 31 , , , 3352 232 32 “ " `` 3352 232 33 Mapping Mapping NNP 3352 232 34 Entry Entry NNP 3352 232 35 Vocabulary Vocabulary NNP 3352 232 36 . . . 3352 232 37 ” " '' 3352 232 38 5 5 CD 3352 232 39 . . . 3352 233 1 Counting count VBG 3352 233 2 California California NNP 3352 233 3 , , , 3352 233 4 http://countingcalifornia.cdlib.org http://countingcalifornia.cdlib.org NNP 3352 233 5 ( ( -LRB- 3352 233 6 accessed access VBN 3352 233 7 July July NNP 3352 233 8 18 18 CD 3352 233 9 , , , 3352 233 10 2006 2006 CD 3352 233 11 ) ) -RRB- 3352 233 12 . . . 3352 234 1 6 6 CD 3352 234 2 . . . 3352 235 1 “ " `` 3352 235 2 Factsheet factsheet NN 3352 235 3 : : : 3352 235 4 Unified Unified NNP 3352 235 5 Medical Medical NNP 3352 235 6 Language Language NNP 3352 235 7 System System NNP 3352 235 8 , , , 3352 235 9 ” " '' 3352 235 10 www www NNP 3352 235 11 .nlm.nih.gov .nlm.nih.gov NN 3352 235 12 / / SYM 3352 235 13 pubs pub NNS 3352 235 14 / / SYM 3352 235 15 factsheets factsheet NNS 3352 235 16 / / SYM 3352 235 17 umls.html umls.html NNS 3352 235 18 ( ( -LRB- 3352 235 19 accessed access VBN 3352 235 20 July July NNP 3352 235 21 18 18 CD 3352 235 22 , , , 3352 235 23 2006 2006 CD 3352 235 24 ) ) -RRB- 3352 235 25 . . . 3352 236 1 7 7 LS 3352 236 2 . . . 3352 237 1 William William NNP 3352 237 2 S. S. 
NNP 3352 237 3 Cooper Cooper NNP 3352 237 4 , , , 3352 237 5 Aitao Aitao NNP 3352 237 6 Chen Chen NNP 3352 237 7 , , , 3352 237 8 and and CC 3352 237 9 Fredric Fredric NNP 3352 237 10 C. C. NNP 3352 237 11 Gey Gey NNP 3352 237 12 , , , 3352 237 13 “ " `` 3352 237 14 Full- Full- NNP 3352 237 15 Text Text NNP 3352 237 16 Retrieval retrieval NN 3352 237 17 Based base VBN 3352 237 18 on on IN 3352 237 19 Probabilistic probabilistic JJ 3352 237 20 Equations equation NNS 3352 237 21 with with IN 3352 237 22 Coefficients coefficient NNS 3352 237 23 Fitted fit VBN 3352 237 24 by by IN 3352 237 25 Logistic Logistic NNP 3352 237 26 Regression Regression NNP 3352 237 27 , , , 3352 237 28 ” " '' 3352 237 29 in in IN 3352 237 30 D. D. NNP 3352 237 31 K. K. NNP 3352 237 32 Harman Harman NNP 3352 237 33 , , , 3352 237 34 ed ed NNP 3352 237 35 . . NNP 3352 237 36 , , , 3352 237 37 The the DT 3352 237 38 Second Second NNP 3352 237 39 Text Text NNP 3352 237 40 REtrieval REtrieval NNP 3352 237 41 Conference Conference NNP 3352 237 42 ( ( -LRB- 3352 237 43 TREC-2 TREC-2 NNP 3352 237 44 ) ) -RRB- 3352 237 45 , , , 3352 237 46 March March NNP 3352 237 47 1994 1994 CD 3352 237 48 , , , 3352 237 49 57–66 57–66 NNP 3352 237 50 ( ( -LRB- 3352 237 51 Gaith- Gaith- NNP 3352 237 52 ersburg ersburg NN 3352 237 53 , , , 3352 237 54 Md. Md. NNP 3352 238 1 : : : 3352 238 2 National National NNP 3352 238 3 Institute Institute NNP 3352 238 4 of of IN 3352 238 5 Standards Standards NNPS 3352 238 6 and and CC 3352 238 7 Technol- Technol- NNP 3352 238 8 ogy ogy NN 3352 238 9 , , , 3352 238 10 1994 1994 CD 3352 238 11 ) ) -RRB- 3352 238 12 , , , 3352 238 13 http://trec.nist.gov/pubs/trec2/papers/txt/05.txt http://trec.nist.gov/pubs/trec2/papers/txt/05.txt NN 3352 238 14 ( ( -LRB- 3352 238 15 accessed access VBN 3352 238 16 July July NNP 3352 238 17 18 18 CD 3352 238 18 , , , 3352 238 19 2006 2006 CD 3352 238 20 ) ) -RRB- 3352 238 21 . . . 3352 239 1 8 8 LS 3352 239 2 . . . 
3352 240 1 “ " `` 3352 240 2 Going go VBG 3352 240 3 Places Places NNPS 3352 240 4 in in IN 3352 240 5 the the DT 3352 240 6 Catalog catalog NN 3352 240 7 : : : 3352 240 8 Improved Improved NNP 3352 240 9 Geographical Geographical NNP 3352 240 10 Access Access NNP 3352 240 11 , , , 3352 240 12 ” " '' 3352 240 13 http://ecai.org/imls2002 http://ecai.org/imls2002 NN 3352 240 14 ( ( -LRB- 3352 240 15 accessed access VBN 3352 240 16 Jul. July NNP 3352 241 1 18 18 CD 3352 241 2 , , , 3352 241 3 2006 2006 CD 3352 241 4 ) ) -RRB- 3352 241 5 . . . 3352 242 1 9 9 CD 3352 242 2 . . . 3352 243 1 Vivien Vivien NNP 3352 243 2 Petras Petras NNP 3352 243 3 , , , 3352 243 4 Ray Ray NNP 3352 243 5 Larson Larson NNP 3352 243 6 , , , 3352 243 7 and and CC 3352 243 8 Michael Michael NNP 3352 243 9 Buckland Buckland NNP 3352 243 10 , , , 3352 243 11 “ " `` 3352 243 12 Time Time NNP 3352 243 13 Period Period NNP 3352 243 14 Directories Directories NNPS 3352 243 15 : : : 3352 243 16 A a DT 3352 243 17 Metadata Metadata NNP 3352 243 18 Infrastructure infrastructure NN 3352 243 19 for for IN 3352 243 20 Placing place VBG 3352 243 21 Events event NNS 3352 243 22 in in IN 3352 243 23 Temporal Temporal NNP 3352 243 24 and and CC 3352 243 25 Geographic Geographic NNP 3352 243 26 Context Context NNP 3352 243 27 , , , 3352 243 28 ” " '' 3352 243 29 in in IN 3352 243 30 Opening Opening NNP 3352 243 31 Information Information NNP 3352 243 32 Horizons Horizons NNP 3352 243 33 : : : 3352 243 34 Joint Joint NNP 3352 243 35 Conference Conference NNP 3352 243 36 on on IN 3352 243 37 Digital Digital NNP 3352 243 38 Libraries Libraries NNPS 3352 243 39 ( ( -LRB- 3352 243 40 JCDL JCDL NNP 3352 243 41 ) ) -RRB- 3352 243 42 , , , 3352 243 43 Chapel Chapel NNP 3352 243 44 Hill Hill NNP 3352 243 45 , , , 3352 243 46 N.C. 
North Carolina NNP 3352 243 47 , , , 3352 243 48 June June NNP 3352 243 49 11–15 11–15 CD 3352 243 50 , , , 3352 243 51 2006 2006 CD 3352 243 52 , , , 3352 243 53 forthcoming forthcoming JJ 3352 243 54 , , , 3352 243 55 http://metadata.sims http://metadata.sim NNS 3352 243 56 .berkeley.edu .berkeley.edu . 3352 243 57 / / SYM 3352 243 58 tpdJCDL06.pdf tpdJCDL06.pdf NNS 3352 243 59 ( ( -LRB- 3352 243 60 accessed access VBN 3352 243 61 July July NNP 3352 243 62 18 18 CD 3352 243 63 , , , 3352 243 64 2006 2006 CD 3352 243 65 ) ) -RRB- 3352 243 66 ; ; : 3352 243 67 “ " `` 3352 243 68 Support support NN 3352 243 69 for for IN 3352 243 70 the the DT 3352 243 71 Learner Learner NNP 3352 243 72 : : : 3352 243 73 What what WP 3352 243 74 , , , 3352 243 75 Where where WRB 3352 243 76 , , , 3352 243 77 When when WRB 3352 243 78 , , , 3352 243 79 and and CC 3352 243 80 Who who WP 3352 243 81 , , , 3352 243 82 ” " '' 3352 243 83 http://ecai http://ecai ADD 3352 243 84 .org .org . 3352 243 85 / / NFP 3352 243 86 imls2004 imls2004 NNP 3352 243 87 ( ( -LRB- 3352 243 88 accessed access VBN 3352 243 89 July July NNP 3352 243 90 18 18 CD 3352 243 91 , , , 3352 243 92 2006 2006 CD 3352 243 93 ) ) -RRB- 3352 243 94 . . . 
3352 244 15 Appendix Appendix NNP 3352 244 16 : : : 3352 244 17 Statistical statistical JJ 3352 244 18 association association NN 3352 244 19 methodology methodology NN 3352 244 20 A a DT 3352 244 21 statistical statistical JJ 3352 244 22 maximum maximum JJ 3352 244 23 likelihood likelihood NN 3352 244 24 ratio ratio NN 3352 244 25 weighting weight VBG 3352 244 26 tech- tech- JJ 3352 244 27 nique nique NNP 3352 244 28 was be VBD 3352 244 29 used use VBN 3352 244 30 to to TO 3352 244 31 construct construct VB 3352 244 32 a a DT 3352 244 33 two two CD 3352 244 34 - - HYPH 3352 244 35 way way NN 3352 244 36 contingency contingency NN 3352 244 37 table table NN 3352 244 38 relating relate VBG 3352 244 39 each each DT 3352 244 40 natural natural JJ 3352 244 41 - - HYPH 3352 244 42 language language NN 3352 244 43 term term NN 3352 244 44 ( ( -LRB- 3352 244 45 word word NN 3352 244 46 or or CC 3352 244 47 phrase phrase NN 3352 244 48 ) ) -RRB- 3352 244 49 with with IN 3352 244 50 each each DT 3352 244 51 value value NN 3352 244 52 in in IN 3352 244 53 the the DT 3352 244 54 metadata metadata NN 3352 244 55 vocabulary vocabulary NN 3352 244 56 of of IN 3352 244 57 a a DT 3352 244 58 resource resource NN 3352 244 59 , , , 3352 244 60 e.g. e.g. RB 3352 244 61 , , , 3352 244 62 LCSH LCSH NNP 3352 244 63 , , , 3352 244 64 LCCNs LCCNs NNP 3352 244 65 , , , 3352 244 66 U.S. U.S. 
NNP 3352 244 67 Patent Patent NNP 3352 244 68 Classification Classification NNP 3352 244 69 Numbers Numbers NNPS 3352 244 70 , , , 3352 244 71 and and CC 3352 244 72 so so RB 3352 244 73 on.1 on.1 VBZ 3352 244 74 An an DT 3352 244 75 associative associative JJ 3352 244 76 dictionary dictionary NN 3352 244 77 that that WDT 3352 244 78 will will MD 3352 244 79 map map VB 3352 244 80 words word NNS 3352 244 81 in in IN 3352 244 82 natural natural JJ 3352 244 83 languages language NNS 3352 244 84 into into IN 3352 244 85 metadata metadata NN 3352 244 86 terms term NNS 3352 244 87 can can MD 3352 244 88 also also RB 3352 244 89 , , , 3352 244 90 in in IN 3352 244 91 reverse reverse NN 3352 244 92 , , , 3352 244 93 return return NN 3352 244 94 words word NNS 3352 244 95 in in IN 3352 244 96 natural natural JJ 3352 244 97 language language NN 3352 244 98 that that WDT 3352 244 99 are be VBP 3352 244 100 closely closely RB 3352 244 101 associated associate VBN 3352 244 102 with with IN 3352 244 103 a a DT 3352 244 104 metadata metadata NN 3352 244 105 value value NN 3352 244 106 . . . 3352 245 1 Training training NN 3352 245 2 records record NNS 3352 245 3 containing contain VBG 3352 245 4 two two CD 3352 245 5 different different JJ 3352 245 6 metadata metadata NN 3352 245 7 vocabularies vocabulary NNS 3352 245 8 can can MD 3352 245 9 be be VB 3352 245 10 used use VBN 3352 245 11 to to TO 3352 245 12 create create VB 3352 245 13 direct direct JJ 3352 245 14 mappings mapping NNS 3352 245 15 between between IN 3352 245 16 the the DT 3352 245 17 values value NNS 3352 245 18 of of IN 3352 245 19 the the DT 3352 245 20 two two CD 3352 245 21 metadata metadata NN 3352 245 22 vocabularies vocabulary NNS 3352 245 23 . . . 3352 246 1 For for IN 3352 246 2 example example NN 3352 246 3 , , , 3352 246 4 U.S. U.S. NNP 3352 246 5 patents patent NNS 3352 246 6 contain contain VBP 3352 246 7 both both DT 3352 246 8 U.S. U.S. 
NNP 3352 246 9 and and CC 3352 246 10 International International NNP 3352 246 11 Patent Patent NNP 3352 246 12 Classification Classification NNP 3352 246 13 numbers number NNS 3352 246 14 and and CC 3352 246 15 so so RB 3352 246 16 can can MD 3352 246 17 be be VB 3352 246 18 used use VBN 3352 246 19 to to TO 3352 246 20 create create VB 3352 246 21 a a DT 3352 246 22 mapping mapping NN 3352 246 23 between between IN 3352 246 24 these these DT 3352 246 25 two two CD 3352 246 26 quite quite RB 3352 246 27 different different JJ 3352 246 28 classifica- classifica- NN 3352 246 29 tions tion NNS 3352 246 30 . . . 3352 247 1 Multilingual multilingual JJ 3352 247 2 training training NN 3352 247 3 sets set NNS 3352 247 4 , , , 3352 247 5 such such JJ 3352 247 6 as as IN 3352 247 7 catalog catalog NN 3352 247 8 records record NNS 3352 247 9 for for IN 3352 247 10 multilingual multilingual JJ 3352 247 11 library library NN 3352 247 12 collections collection NNS 3352 247 13 , , , 3352 247 14 can can MD 3352 247 15 be be VB 3352 247 16 used use VBN 3352 247 17 to to TO 3352 247 18 create create VB 3352 247 19 multilingual multilingual JJ 3352 247 20 natural natural JJ 3352 247 21 language language NN 3352 247 22 indexes index NNS 3352 247 23 to to TO 3352 247 24 metadata metadata VB 3352 247 25 vocabu- vocabu- JJ 3352 247 26 laries larie NNS 3352 247 27 and and CC 3352 247 28 , , , 3352 247 29 also also RB 3352 247 30 , , , 3352 247 31 mappings mapping NNS 3352 247 32 between between IN 3352 247 33 natural natural JJ 3352 247 34 language language NN 3352 247 35 vocabularies vocabulary NNS 3352 247 36 . . . 
3352 248 1 In in IN 3352 248 2 addition addition NN 3352 248 3 to to IN 3352 248 4 the the DT 3352 248 5 maximum maximum JJ 3352 248 6 likelihood likelihood NN 3352 248 7 ratio ratio NN 3352 248 8 - - HYPH 3352 248 9 based base VBN 3352 248 10 association association NN 3352 248 11 measure measure NN 3352 248 12 , , , 3352 248 13 there there EX 3352 248 14 are be VBP 3352 248 15 a a DT 3352 248 16 number number NN 3352 248 17 of of IN 3352 248 18 other other JJ 3352 248 19 asso- asso- NN 3352 248 20 ciation ciation NN 3352 248 21 measures measure NNS 3352 248 22 , , , 3352 248 23 such such JJ 3352 248 24 as as IN 3352 248 25 the the DT 3352 248 26 Chi Chi NNP 3352 248 27 - - HYPH 3352 248 28 square square JJ 3352 248 29 statistic statistic NN 3352 248 30 , , , 3352 248 31 mutual mutual JJ 3352 248 32 information information NN 3352 248 33 measure measure NN 3352 248 34 , , , 3352 248 35 and and CC 3352 248 36 so so RB 3352 248 37 on on RB 3352 248 38 , , , 3352 248 39 that that DT 3352 248 40 can can MD 3352 248 41 be be VB 3352 248 42 used use VBN 3352 248 43 in in IN 3352 248 44 creat- creat- NNP 3352 248 45 ing ing NNP 3352 248 46 association association NNP 3352 248 47 dictionaries dictionary NNS 3352 248 48 . . . 3352 249 1 The the DT 3352 249 2 training training NN 3352 249 3 set set NN 3352 249 4 used use VBN 3352 249 5 to to TO 3352 249 6 create create VB 3352 249 7 the the DT 3352 249 8 word word NN 3352 249 9 - - HYPH 3352 249 10 to to IN 3352 249 11 - - HYPH 3352 249 12 LCSH LCSH NNP 3352 249 13 EVI EVI NNP 3352 249 14 was be VBD 3352 249 15 a a DT 3352 249 16 set set NN 3352 249 17 of of IN 3352 249 18 catalog catalog NN 3352 249 19 records record NNS 3352 249 20 with with IN 3352 249 21 at at RB 3352 249 22 least least RBS 3352 249 23 one one CD 3352 249 24 assigned assign VBN 3352 249 25 LCSH LCSH NNP 3352 249 26 ( ( -LRB- 3352 249 27 i.e. i.e. 
FW 3352 249 28 , , , 3352 249 29 at at IN 3352 249 30 least least RBS 3352 249 31 one one CD 3352 249 32 6xx 6xx JJ 3352 249 33 field field NN 3352 249 34 ) ) -RRB- 3352 249 35 . . . 3352 250 1 Natural natural JJ 3352 250 2 language language NN 3352 250 3 terms term NNS 3352 250 4 were be VBD 3352 250 5 extracted extract VBN 3352 250 6 from from IN 3352 250 7 the the DT 3352 250 8 title title NN 3352 250 9 ( ( -LRB- 3352 250 10 field field NN 3352 250 11 245a 245a NNPS 3352 250 12 ) ) -RRB- 3352 250 13 , , , 3352 250 14 subtitle subtitle NNP 3352 250 15 ( ( -LRB- 3352 250 16 245b 245b NNPS 3352 250 17 ) ) -RRB- 3352 250 18 , , , 3352 250 19 and and CC 3352 250 20 summary summary NN 3352 250 21 note note NN 3352 250 22 ( ( -LRB- 3352 250 23 520a 520a NN 3352 250 24 ) ) -RRB- 3352 250 25 . . . 3352 251 1 These these DT 3352 251 2 terms term NNS 3352 251 3 were be VBD 3352 251 4 tokenized tokenize VBN 3352 251 5 ; ; : 3352 251 6 the the DT 3352 251 7 stopwords stopword NNS 3352 251 8 were be VBD 3352 251 9 removed remove VBN 3352 251 10 ; ; : 3352 251 11 and and CC 3352 251 12 the the DT 3352 251 13 remaining remain VBG 3352 251 14 words word NNS 3352 251 15 were be VBD 3352 251 16 normalized normalized JJ 3352 251 17 . . . 3352 252 1 A a DT 3352 252 2 token token NN 3352 252 3 here here RB 3352 252 4 can can MD 3352 252 5 contain contain VB 3352 252 6 only only JJ 3352 252 7 letters letter NNS 3352 252 8 and and CC 3352 252 9 digits digit NNS 3352 252 10 . . . 3352 253 1 All all DT 3352 253 2 tokens token NNS 3352 253 3 were be VBD 3352 253 4 then then RB 3352 253 5 changed change VBN 3352 253 6 to to IN 3352 253 7 lower low JJR 3352 253 8 case case NN 3352 253 9 . . . 
3352 254 1 The the DT 3352 254 2 stoplist stoplist NN 3352 254 3 has have VBZ 3352 254 4 about about RB 3352 254 5 six six CD 3352 254 6 hundred hundred CD 3352 254 7 words word NNS 3352 254 8 considered consider VBN 3352 254 9 not not RB 3352 254 10 to to TO 3352 254 11 be be VB 3352 254 12 content content NN 3352 254 13 bearing bearing NN 3352 254 14 , , , 3352 254 15 such such JJ 3352 254 16 as as IN 3352 254 17 pronouns pronoun NNS 3352 254 18 , , , 3352 254 19 prepositions preposition NNS 3352 254 20 , , , 3352 254 21 coordinators coordinator NNS 3352 254 22 , , , 3352 254 23 determiners determiner NNS 3352 254 24 , , , 3352 254 25 and and CC 3352 254 26 the the DT 3352 254 27 like like JJ 3352 254 28 . . . 3352 255 1 The the DT 3352 255 2 content content NN 3352 255 3 words word NNS 3352 255 4 ( ( -LRB- 3352 255 5 those those DT 3352 255 6 not not RB 3352 255 7 treated treat VBN 3352 255 8 as as IN 3352 255 9 stopwords stopword NNS 3352 255 10 ) ) -RRB- 3352 255 11 were be VBD 3352 255 12 normalized normalized JJ 3352 255 13 using use VBG 3352 255 14 a a DT 3352 255 15 table table NN 3352 255 16 derived derive VBN 3352 255 17 from from IN 3352 255 18 an an DT 3352 255 19 English english JJ 3352 255 20 morphological morphological NN 3352 255 21 analyzer.2 analyzer.2 NN 3352 255 22 The the DT 3352 255 23 table table NN 3352 255 24 maps map VBZ 3352 255 25 plural plural JJ 3352 255 26 nouns noun NNS 3352 255 27 into into IN 3352 255 28 singular singular JJ 3352 255 29 ones one NNS 3352 255 30 ; ; , 3352 255 31 verbs verb NNS 3352 255 32 into into IN 3352 255 33 the the DT 3352 255 34 infinitive infinitive JJ 3352 255 35 form form NN 3352 255 36 ; ; , 3352 255 37 and and CC 3352 255 38 comparative comparative JJ 3352 255 39 and and CC 3352 255 40 superlative superlative JJ 3352 255 41 adjectives adjective NNS 3352 255 42 to to IN 3352 255 43 the the DT 3352 255 44 positive positive JJ 3352 255 45 form form NN 3352 255 46 . . . 
3352 256 1 For for IN 3352 256 2 example example NN 3352 256 3 , , , 3352 256 4 the the DT 3352 256 5 plural plural JJ 3352 256 6 noun noun NNP 3352 256 7 printers printer NNS 3352 256 8 is be VBZ 3352 256 9 reduced reduce VBN 3352 256 10 to to TO 3352 256 11 printer printer VB 3352 256 12 , , , 3352 256 13 and and CC 3352 256 14 children child NNS 3352 256 15 to to IN 3352 256 16 child child VB 3352 256 17 ; ; : 3352 256 18 the the DT 3352 256 19 comparative comparative JJ 3352 256 20 adjective adjective NN 3352 256 21 longer long RBR 3352 256 22 and and CC 3352 256 23 the the DT 3352 256 24 superlative superlative JJ 3352 256 25 adjective adjective NN 3352 256 26 longest long JJS 3352 256 27 are be VBP 3352 256 28 reduced reduce VBN 3352 256 29 to to IN 3352 256 30 long long RB 3352 256 31 ; ; : 3352 256 32 and and CC 3352 256 33 printing printing NN 3352 256 34 , , , 3352 256 35 printed print VBN 3352 256 36 , , , 3352 256 37 and and CC 3352 256 38 prints print NNS 3352 256 39 are be VBP 3352 256 40 all all DT 3352 256 41 reduced reduce VBN 3352 256 42 to to IN 3352 256 43 the the DT 3352 256 44 same same JJ 3352 256 45 base base NN 3352 256 46 form form NN 3352 256 47 print print NN 3352 256 48 . . . 
3352 257 1 When when WRB 3352 257 2 a a DT 3352 257 3 word word NN 3352 257 4 belonging belong VBG 3352 257 5 to to IN 3352 257 6 more more JJR 3352 257 7 than than IN 3352 257 8 one one CD 3352 257 9 part part NN 3352 257 10 - - HYPH 3352 257 11 of of IN 3352 257 12 - - HYPH 3352 257 13 speech speech NN 3352 257 14 category category NN 3352 257 15 can can MD 3352 257 16 be be VB 3352 257 17 reduced reduce VBN 3352 257 18 to to IN 3352 257 19 more more JJR 3352 257 20 than than IN 3352 257 21 one one CD 3352 257 22 form form NN 3352 257 23 , , , 3352 257 24 it -PRON- PRP 3352 257 25 is be VBZ 3352 257 26 changed change VBN 3352 257 27 to to IN 3352 257 28 the the DT 3352 257 29 first first JJ 3352 257 30 form form NN 3352 257 31 listed list VBN 3352 257 32 in in IN 3352 257 33 the the DT 3352 257 34 morphological morphological JJ 3352 257 35 analyzer analyzer NN 3352 257 36 table table NN 3352 257 37 . . . 3352 258 1 As as IN 3352 258 2 an an DT 3352 258 3 example example NN 3352 258 4 , , , 3352 258 5 the the DT 3352 258 6 word word NN 3352 258 7 saw see VBD 3352 258 8 , , , 3352 258 9 which which WDT 3352 258 10 can can MD 3352 258 11 be be VB 3352 258 12 a a DT 3352 258 13 noun noun NN 3352 258 14 or or CC 3352 258 15 the the DT 3352 258 16 past past JJ 3352 258 17 tense tense NN 3352 258 18 of of IN 3352 258 19 the the DT 3352 258 20 verb verb NN 3352 258 21 to to TO 3352 258 22 see see VB 3352 258 23 , , , 3352 258 24 is be VBZ 3352 258 25 not not RB 3352 258 26 reduced reduce VBN 3352 258 27 to to TO 3352 258 28 see see VB 3352 258 29 . . . 3352 259 1 Subject subject JJ 3352 259 2 headings heading NNS 3352 259 3 ( ( -LRB- 3352 259 4 field field NN 3352 259 5 6xxa 6xxa CD 3352 259 6 ) ) -RRB- 3352 259 7 were be VBD 3352 259 8 extracted extract VBN 3352 259 9 without without IN 3352 259 10 qualifying qualify VBG 3352 259 11 subdivisions subdivision NNS 3352 259 12 . . . 
3352 260 1 The the DT 3352 260 2 inclusion inclusion NN 3352 260 3 of of IN 3352 260 4 foreign foreign JJ 3352 260 5 words word NNS 3352 260 6 ( ( -LRB- 3352 260 7 alcoholismo alcoholismo NNP 3352 260 8 , , , 3352 260 9 alcoolisme alcoolisme NNP 3352 260 10 , , , 3352 260 11 alkohol alkohol NNP 3352 260 12 , , , 3352 260 13 and and CC 3352 260 14 alcool alcool NNP 3352 260 15 ) ) -RRB- 3352 260 16 , , , 3352 260 17 derived derive VBN 3352 260 18 from from IN 3352 260 19 titles title NNS 3352 260 20 in in IN 3352 260 21 foreign foreign JJ 3352 260 22 languages language NNS 3352 260 23 , , , 3352 260 24 demonstrate demonstrate VBP 3352 260 25 that that IN 3352 260 26 the the DT 3352 260 27 technique technique NN 3352 260 28 is be VBZ 3352 260 29 language language NN 3352 260 30 independent independent JJ 3352 260 31 and and CC 3352 260 32 could could MD 3352 260 33 be be VB 3352 260 34 adopted adopt VBN 3352 260 35 in in IN 3352 260 36 any any DT 3352 260 37 country country NN 3352 260 38 . . . 3352 261 1 It -PRON- PRP 3352 261 2 could could MD 3352 261 3 also also RB 3352 261 4 support support VB 3352 261 5 diversity diversity NN 3352 261 6 in in IN 3352 261 7 U.S. U.S. NNP 3352 261 8 libraries library NNS 3352 261 9 by by IN 3352 261 10 allowing allow VBG 3352 261 11 searches search NNS 3352 261 12 in in IN 3352 261 13 Spanish spanish JJ 3352 261 14 or or CC 3352 261 15 other other JJ 3352 261 16 languages language NNS 3352 261 17 , , , 3352 261 18 so so RB 3352 261 19 long long RB 3352 261 20 as as IN 3352 261 21 the the DT 3352 261 22 training training NN 3352 261 23 set set NN 3352 261 24 contains contain VBZ 3352 261 25 sufficient sufficient JJ 3352 261 26 content content NN 3352 261 27 words word NNS 3352 261 28 . . . 3352 262 1 EVIs evi NNS 3352 262 2 are be VBP 3352 262 3 accessible accessible JJ 3352 262 4 at at IN 3352 262 5 http://metadata http://metadata ADD 3352 262 6 . . . 
3352 263 1 sims.berkeley.edu/prototypesI.html sims.berkeley.edu/prototypesi.html ADD 3352 263 2 . . . 3352 264 1 Fuller full JJR 3352 264 2 descriptions description NNS 3352 264 3 of of IN 3352 264 4 the the DT 3352 264 5 project project NN 3352 264 6 methodology methodology NN 3352 264 7 can can MD 3352 264 8 be be VB 3352 264 9 found find VBN 3352 264 10 in in IN 3352 264 11 the the DT 3352 264 12 literature.3 literature.3 CD 3352 264 13 ■ ■ NFP 3352 264 14 References reference NNS 3352 264 15 1 1 CD 3352 264 16 . . . 3352 265 1 Ted Ted NNP 3352 265 2 Dunning Dunning NNP 3352 265 3 , , , 3352 265 4 “ " `` 3352 265 5 Accurate accurate JJ 3352 265 6 Methods Methods NNPS 3352 265 7 for for IN 3352 265 8 the the DT 3352 265 9 Statistics Statistics NNPS 3352 265 10 of of IN 3352 265 11 Surprise Surprise NNP 3352 265 12 and and CC 3352 265 13 Coincidence Coincidence NNP 3352 265 14 , , , 3352 265 15 ” " '' 3352 265 16 Computational Computational NNP 3352 265 17 Linguistics Linguistics NNP 3352 265 18 19 19 CD 3352 265 19 ( ( -LRB- 3352 265 20 March March NNP 3352 265 21 1993 1993 CD 3352 265 22 ) ) -RRB- 3352 265 23 : : : 3352 265 24 61–74 61–74 CD 3352 265 25 . . . 3352 266 1 2 2 LS 3352 266 2 . . . 3352 267 1 Daniel Daniel NNP 3352 267 2 Karp Karp NNP 3352 267 3 et et FW 3352 267 4 al al NNP 3352 267 5 . . NNP 3352 267 6 , , , 3352 267 7 “ " `` 3352 267 8 A a DT 3352 267 9 Freely freely RB 3352 267 10 Available available JJ 3352 267 11 Wide wide JJ 3352 267 12 Cover- Cover- NNP 3352 267 13 age age NN 3352 267 14 Morphological Morphological NNP 3352 267 15 Analyzer Analyzer NNP 3352 267 16 for for IN 3352 267 17 English English NNP 3352 267 18 , , , 3352 267 19 ” " '' 3352 267 20 in in IN 3352 267 21 Proceedings Proceedings NNP 3352 267 22 of of IN 3352 267 23 COLING-92 COLING-92 NNP 3352 267 24 , , , 3352 267 25 Nantes Nantes NNP 3352 267 26 , , , 3352 267 27 1992 1992 CD 3352 267 28 ( ( -LRB- 3352 267 29 Morristown Morristown NNP 3352 267 30 , , , 3352 267 31 N.J. 
New Jersey NNP 3352 267 32 : : : 3352 267 33 Association Association NNP 3352 267 34 for for IN 3352 267 35 Computational Computational NNP 3352 267 36 Linguistics Linguistics NNP 3352 267 37 , , , 3352 267 38 1992 1992 CD 3352 267 39 ) ) -RRB- 3352 267 40 , , , 3352 267 41 950–55 950–55 CD 3352 267 42 , , , 3352 267 43 http://acl.ldc.upenn http://acl.ldc.upenn NNP 3352 267 44 .edu .edu NNP 3352 267 45 / / SYM 3352 267 46 C C NNP 3352 267 47 / / SYM 3352 267 48 C92 C92 NNP 3352 267 49 / / SYM 3352 267 50 C92 C92 NNP 3352 267 51 - - HYPH 3352 267 52 3145.pdf 3145.pdf NNP 3352 267 53 ( ( -LRB- 3352 267 54 accessed access VBN 3352 267 55 July July NNP 3352 267 56 18 18 CD 3352 267 57 , , , 3352 267 58 2006 2006 CD 3352 267 59 ) ) -RRB- 3352 267 60 . . . 3352 268 1 3 3 LS 3352 268 2 . . . 3352 269 1 Michael Michael NNP 3352 269 2 K. K. NNP 3352 269 3 Buckland Buckland NNP 3352 269 4 , , , 3352 269 5 Fredric Fredric NNP 3352 269 6 C. C. NNP 3352 269 7 Gey Gey NNP 3352 269 8 , , , 3352 269 9 and and CC 3352 269 10 Ray Ray NNP 3352 269 11 R. R. NNP 3352 269 12 Larson Larson NNP 3352 269 13 , , , 3352 269 14 Seamless Seamless NNP 3352 269 15 Searching Searching NNP 3352 269 16 of of IN 3352 269 17 Numeric Numeric NNP 3352 269 18 and and CC 3352 269 19 Textual Textual NNP 3352 269 20 Resources Resources NNPS 3352 269 21 : : : 3352 269 22 Final Final NNP 3352 269 23 Report Report NNP 3352 269 24 on on IN 3352 269 25 Institute Institute NNP 3352 269 26 of of IN 3352 269 27 Museum Museum NNP 3352 269 28 and and CC 3352 269 29 Library Library NNP 3352 269 30 Services Services NNPS 3352 269 31 National National NNP 3352 269 32 Leadership Leadership NNP 3352 269 33 Grant Grant NNP 3352 269 34 No No NNP 3352 269 35 . . . 3352 270 1 178 178 CD 3352 270 2 ( ( -LRB- 3352 270 3 Berkeley Berkeley NNP 3352 270 4 , , , 3352 270 5 Calif. California NNP 3352 270 6 : : : 3352 270 7 Univ Univ NNP 3352 270 8 . . . 
3352 271 1 of of IN 3352 271 2 California California NNP 3352 271 3 , , , 3352 271 4 School School NNP 3352 271 5 of of IN 3352 271 6 Informa- Informa- NNP 3352 271 7 tion tion NN 3352 271 8 Management Management NNP 3352 271 9 and and CC 3352 271 10 Systems Systems NNPS 3352 271 11 , , , 3352 271 12 2002 2002 CD 3352 271 13 ) ) -RRB- 3352 271 14 , , , 3352 271 15 http://metadata.sims http://metadata.sim NNS 3352 271 16 .berkeley.edu .berkeley.edu . 3352 271 17 / / SYM 3352 271 18 papers paper NNS 3352 271 19 / / SYM 3352 271 20 SeamlessSearchFinalReport.pdf SeamlessSearchFinalReport.pdf NNP 3352 271 21 ( ( -LRB- 3352 271 22 accessed access VBN 3352 271 23 Jul. July NNP 3352 272 1 18 18 CD 3352 272 2 , , , 3352 272 3 2006 2006 CD 3352 272 4 ) ) -RRB- 3352 272 5 ; ; : 3352 272 6 Youngin Youngin NNP 3352 272 7 Kim Kim NNP 3352 272 8 et et FW 3352 272 9 al al NNP 3352 272 10 . . NNP 3352 272 11 , , , 3352 272 12 “ " `` 3352 272 13 Using use VBG 3352 272 14 Ordinary ordinary JJ 3352 272 15 Language Language NNP 3352 272 16 to to IN 3352 272 17 Access access NN 3352 272 18 Metadata Metadata NNP 3352 272 19 of of IN 3352 272 20 Diverse diverse JJ 3352 272 21 Types type NNS 3352 272 22 of of IN 3352 272 23 Information Information NNP 3352 272 24 Resources Resources NNPS 3352 272 25 : : : 3352 272 26 Trade Trade NNP 3352 272 27 Classification Classification NNP 3352 272 28 and and CC 3352 272 29 Numeric Numeric NNP 3352 272 30 Data Data NNP 3352 272 31 , , , 3352 272 32 ” " '' 3352 272 33 in in IN 3352 272 34 Knowledge knowledge NN 3352 272 35 : : : 3352 272 36 Creation Creation NNP 3352 272 37 , , , 3352 272 38 Organization Organization NNP 3352 272 39 , , , 3352 272 40 and and CC 3352 272 41 Use Use NNP 3352 272 42 . . . 
3352 273 1 Proceedings proceeding NNS 3352 273 2 of of IN 3352 273 3 the the DT 3352 273 4 American American NNP 3352 273 5 Society Society NNP 3352 273 6 for for IN 3352 273 7 Infor- Infor- NNP 3352 273 8 mation mation NN 3352 273 9 Science Science NNP 3352 273 10 Annual Annual NNP 3352 273 11 Meeting Meeting NNP 3352 273 12 , , , 3352 273 13 Oct. October NNP 3352 273 14 29–Nov 29–Nov NNP 3352 273 15 . . . 3352 274 1 4 4 CD 3352 274 2 , , , 3352 274 3 1999 1999 CD 3352 274 4 ( ( -LRB- 3352 274 5 Medford Medford NNP 3352 274 6 , , , 3352 274 7 N.J. New Jersey NNP 3352 274 8 : : : 3352 274 9 Information Information NNP 3352 274 10 Today Today NNP 3352 274 11 , , , 3352 274 12 1999 1999 CD 3352 274 13 ) ) -RRB- 3352 274 14 , , , 3352 274 15 172–80 172–80 CD 3352 274 16 . . .