id sid tid token lemma pos cord-020871-1v6dcmt3 1 1 key key NN cord-020871-1v6dcmt3 1 2 : : : cord-020871-1v6dcmt3 2 1 cord-020871 cord-020871 NNP cord-020871-1v6dcmt3 2 2 - - HYPH cord-020871-1v6dcmt3 2 3 1v6dcmt3 1v6dcmt3 CD cord-020871-1v6dcmt3 2 4 authors author NNS cord-020871-1v6dcmt3 2 5 : : : cord-020871-1v6dcmt3 2 6 Papariello Papariello NNP cord-020871-1v6dcmt3 2 7 , , , cord-020871-1v6dcmt3 2 8 Luca Luca NNP cord-020871-1v6dcmt3 2 9 ; ; : cord-020871-1v6dcmt3 2 10 Bampoulidis Bampoulidis NNP cord-020871-1v6dcmt3 2 11 , , , cord-020871-1v6dcmt3 2 12 Alexandros Alexandros NNP cord-020871-1v6dcmt3 2 13 ; ; : cord-020871-1v6dcmt3 2 14 Lupu Lupu NNP cord-020871-1v6dcmt3 2 15 , , , cord-020871-1v6dcmt3 2 16 Mihai Mihai NNP cord-020871-1v6dcmt3 2 17 title title NN cord-020871-1v6dcmt3 2 18 : : : cord-020871-1v6dcmt3 3 1 On on IN cord-020871-1v6dcmt3 3 2 the the DT cord-020871-1v6dcmt3 3 3 Replicability Replicability NNP cord-020871-1v6dcmt3 3 4 of of IN cord-020871-1v6dcmt3 3 5 Combining combine VBG cord-020871-1v6dcmt3 3 6 Word Word NNP cord-020871-1v6dcmt3 3 7 Embeddings Embeddings NNP cord-020871-1v6dcmt3 3 8 and and CC cord-020871-1v6dcmt3 3 9 Retrieval Retrieval NNP cord-020871-1v6dcmt3 3 10 Models Models NNPS cord-020871-1v6dcmt3 3 11 date date NN cord-020871-1v6dcmt3 3 12 : : : cord-020871-1v6dcmt3 3 13 2020 2020 CD cord-020871-1v6dcmt3 3 14 - - HYPH cord-020871-1v6dcmt3 3 15 03 03 CD cord-020871-1v6dcmt3 3 16 - - HYPH cord-020871-1v6dcmt3 3 17 24 24 CD cord-020871-1v6dcmt3 3 18 journal journal NN cord-020871-1v6dcmt3 3 19 : : : cord-020871-1v6dcmt3 3 20 Advances advance NNS cord-020871-1v6dcmt3 3 21 in in IN cord-020871-1v6dcmt3 3 22 Information Information NNP cord-020871-1v6dcmt3 3 23 Retrieval Retrieval NNP cord-020871-1v6dcmt3 3 24 DOI DOI NNP cord-020871-1v6dcmt3 3 25 : : : cord-020871-1v6dcmt3 3 26 10.1007/978 10.1007/978 CD cord-020871-1v6dcmt3 3 27 - - HYPH cord-020871-1v6dcmt3 3 28 3 3 CD cord-020871-1v6dcmt3 3 29 - - HYPH cord-020871-1v6dcmt3 
3 30 030 030 CD cord-020871-1v6dcmt3 3 31 - - HYPH cord-020871-1v6dcmt3 3 32 45442 45442 CD cord-020871-1v6dcmt3 3 33 - - HYPH cord-020871-1v6dcmt3 3 34 5_7 5_7 CD cord-020871-1v6dcmt3 3 35 sha sha NNP cord-020871-1v6dcmt3 3 36 : : : cord-020871-1v6dcmt3 3 37 86513b7187c9f4bfcaf76e989df065eb53a9f871 86513b7187c9f4bfcaf76e989df065eb53a9f871 CD cord-020871-1v6dcmt3 3 38 doc_id doc_id CD cord-020871-1v6dcmt3 3 39 : : : cord-020871-1v6dcmt3 3 40 20871 20871 CD cord-020871-1v6dcmt3 3 41 cord_uid cord_uid NNS cord-020871-1v6dcmt3 3 42 : : : cord-020871-1v6dcmt3 4 1 1v6dcmt3 1v6dcmt3 CD cord-020871-1v6dcmt3 5 1 We -PRON- PRP cord-020871-1v6dcmt3 5 2 replicate replicate VBP cord-020871-1v6dcmt3 5 3 recent recent JJ cord-020871-1v6dcmt3 5 4 experiments experiment NNS cord-020871-1v6dcmt3 5 5 attempting attempt VBG cord-020871-1v6dcmt3 5 6 to to TO cord-020871-1v6dcmt3 5 7 demonstrate demonstrate VB cord-020871-1v6dcmt3 5 8 an an DT cord-020871-1v6dcmt3 5 9 attractive attractive JJ cord-020871-1v6dcmt3 5 10 hypothesis hypothesis NN cord-020871-1v6dcmt3 5 11 about about IN cord-020871-1v6dcmt3 5 12 the the DT cord-020871-1v6dcmt3 5 13 use use NN cord-020871-1v6dcmt3 5 14 of of IN cord-020871-1v6dcmt3 5 15 the the DT cord-020871-1v6dcmt3 5 16 Fisher Fisher NNP cord-020871-1v6dcmt3 5 17 kernel kernel NN cord-020871-1v6dcmt3 5 18 framework framework NN cord-020871-1v6dcmt3 5 19 and and CC cord-020871-1v6dcmt3 5 20 mixture mixture NN cord-020871-1v6dcmt3 5 21 models model NNS cord-020871-1v6dcmt3 5 22 for for IN cord-020871-1v6dcmt3 5 23 aggregating aggregate VBG cord-020871-1v6dcmt3 5 24 word word NN cord-020871-1v6dcmt3 5 25 embeddings embedding NNS cord-020871-1v6dcmt3 5 26 towards towards IN cord-020871-1v6dcmt3 5 27 document document NN cord-020871-1v6dcmt3 5 28 representations representation NNS cord-020871-1v6dcmt3 5 29 and and CC cord-020871-1v6dcmt3 5 30 the the DT cord-020871-1v6dcmt3 5 31 use use NN cord-020871-1v6dcmt3 5 32 of of IN cord-020871-1v6dcmt3 5 33 these 
these DT cord-020871-1v6dcmt3 5 34 representations representation NNS cord-020871-1v6dcmt3 5 35 in in IN cord-020871-1v6dcmt3 5 36 document document NN cord-020871-1v6dcmt3 5 37 classification classification NN cord-020871-1v6dcmt3 5 38 , , , cord-020871-1v6dcmt3 5 39 clustering clustering NN cord-020871-1v6dcmt3 5 40 , , , cord-020871-1v6dcmt3 5 41 and and CC cord-020871-1v6dcmt3 5 42 retrieval retrieval NN cord-020871-1v6dcmt3 5 43 . . . cord-020871-1v6dcmt3 6 1 Specifically specifically RB cord-020871-1v6dcmt3 6 2 , , , cord-020871-1v6dcmt3 6 3 the the DT cord-020871-1v6dcmt3 6 4 hypothesis hypothesis NN cord-020871-1v6dcmt3 6 5 was be VBD cord-020871-1v6dcmt3 6 6 that that IN cord-020871-1v6dcmt3 6 7 the the DT cord-020871-1v6dcmt3 6 8 use use NN cord-020871-1v6dcmt3 6 9 of of IN cord-020871-1v6dcmt3 6 10 a a DT cord-020871-1v6dcmt3 6 11 mixture mixture NN cord-020871-1v6dcmt3 6 12 model model NN cord-020871-1v6dcmt3 6 13 of of IN cord-020871-1v6dcmt3 6 14 von von NNP cord-020871-1v6dcmt3 6 15 Mises Mises NNP cord-020871-1v6dcmt3 6 16 - - HYPH cord-020871-1v6dcmt3 6 17 Fisher Fisher NNP cord-020871-1v6dcmt3 6 18 ( ( -LRB- cord-020871-1v6dcmt3 6 19 VMF VMF NNP cord-020871-1v6dcmt3 6 20 ) ) -RRB- cord-020871-1v6dcmt3 6 21 distributions distribution NNS cord-020871-1v6dcmt3 6 22 instead instead RB cord-020871-1v6dcmt3 6 23 of of IN cord-020871-1v6dcmt3 6 24 Gaussian gaussian JJ cord-020871-1v6dcmt3 6 25 distributions distribution NNS cord-020871-1v6dcmt3 6 26 would would MD cord-020871-1v6dcmt3 6 27 be be VB cord-020871-1v6dcmt3 6 28 beneficial beneficial JJ cord-020871-1v6dcmt3 6 29 because because IN cord-020871-1v6dcmt3 6 30 of of IN cord-020871-1v6dcmt3 6 31 the the DT cord-020871-1v6dcmt3 6 32 focus focus NN cord-020871-1v6dcmt3 6 33 on on IN cord-020871-1v6dcmt3 6 34 cosine cosine NN cord-020871-1v6dcmt3 6 35 distances distance NNS cord-020871-1v6dcmt3 6 36 of of IN cord-020871-1v6dcmt3 6 37 both both CC cord-020871-1v6dcmt3 6 38 VMF VMF NNP 
cord-020871-1v6dcmt3 6 39 and and CC cord-020871-1v6dcmt3 6 40 the the DT cord-020871-1v6dcmt3 6 41 vector vector NN cord-020871-1v6dcmt3 6 42 space space NN cord-020871-1v6dcmt3 6 43 model model NN cord-020871-1v6dcmt3 6 44 traditionally traditionally RB cord-020871-1v6dcmt3 6 45 used use VBN cord-020871-1v6dcmt3 6 46 in in IN cord-020871-1v6dcmt3 6 47 information information NN cord-020871-1v6dcmt3 6 48 retrieval retrieval NN cord-020871-1v6dcmt3 6 49 . . . cord-020871-1v6dcmt3 7 1 Previous previous JJ cord-020871-1v6dcmt3 7 2 experiments experiment NNS cord-020871-1v6dcmt3 7 3 had have VBD cord-020871-1v6dcmt3 7 4 validated validate VBN cord-020871-1v6dcmt3 7 5 this this DT cord-020871-1v6dcmt3 7 6 hypothesis hypothesis NN cord-020871-1v6dcmt3 7 7 . . . cord-020871-1v6dcmt3 8 1 Our -PRON- PRP$ cord-020871-1v6dcmt3 8 2 replication replication NN cord-020871-1v6dcmt3 8 3 was be VBD cord-020871-1v6dcmt3 8 4 not not RB cord-020871-1v6dcmt3 8 5 able able JJ cord-020871-1v6dcmt3 8 6 to to TO cord-020871-1v6dcmt3 8 7 validate validate VB cord-020871-1v6dcmt3 8 8 it -PRON- PRP cord-020871-1v6dcmt3 8 9 , , , cord-020871-1v6dcmt3 8 10 despite despite IN cord-020871-1v6dcmt3 8 11 a a DT cord-020871-1v6dcmt3 8 12 large large JJ cord-020871-1v6dcmt3 8 13 parameter parameter NN cord-020871-1v6dcmt3 8 14 scan scan JJ cord-020871-1v6dcmt3 8 15 space space NN cord-020871-1v6dcmt3 8 16 . . . 
cord-020871-1v6dcmt3 9 1 The the DT cord-020871-1v6dcmt3 9 2 last last JJ cord-020871-1v6dcmt3 9 3 5 5 CD cord-020871-1v6dcmt3 9 4 years year NNS cord-020871-1v6dcmt3 9 5 have have VBP cord-020871-1v6dcmt3 9 6 seen see VBN cord-020871-1v6dcmt3 9 7 proof proof NN cord-020871-1v6dcmt3 9 8 that that IN cord-020871-1v6dcmt3 9 9 neural neural JJ cord-020871-1v6dcmt3 9 10 network network NN cord-020871-1v6dcmt3 9 11 - - HYPH cord-020871-1v6dcmt3 9 12 based base VBN cord-020871-1v6dcmt3 9 13 word word NN cord-020871-1v6dcmt3 9 14 embedding embedding NN cord-020871-1v6dcmt3 9 15 models model NNS cord-020871-1v6dcmt3 9 16 provide provide VBP cord-020871-1v6dcmt3 9 17 term term NN cord-020871-1v6dcmt3 9 18 representations representation NNS cord-020871-1v6dcmt3 9 19 that that WDT cord-020871-1v6dcmt3 9 20 are be VBP cord-020871-1v6dcmt3 9 21 a a DT cord-020871-1v6dcmt3 9 22 useful useful JJ cord-020871-1v6dcmt3 9 23 information information NN cord-020871-1v6dcmt3 9 24 source source NN cord-020871-1v6dcmt3 9 25 for for IN cord-020871-1v6dcmt3 9 26 a a DT cord-020871-1v6dcmt3 9 27 variety variety NN cord-020871-1v6dcmt3 9 28 of of IN cord-020871-1v6dcmt3 9 29 tasks task NNS cord-020871-1v6dcmt3 9 30 in in IN cord-020871-1v6dcmt3 9 31 natural natural JJ cord-020871-1v6dcmt3 9 32 language language NN cord-020871-1v6dcmt3 9 33 processing processing NN cord-020871-1v6dcmt3 9 34 . . . 
cord-020871-1v6dcmt3 10 1 In in IN cord-020871-1v6dcmt3 10 2 information information NN cord-020871-1v6dcmt3 10 3 retrieval retrieval NN cord-020871-1v6dcmt3 10 4 ( ( -LRB- cord-020871-1v6dcmt3 10 5 IR IR NNP cord-020871-1v6dcmt3 10 6 ) ) -RRB- cord-020871-1v6dcmt3 10 7 , , , cord-020871-1v6dcmt3 10 8 " " `` cord-020871-1v6dcmt3 10 9 traditional traditional JJ cord-020871-1v6dcmt3 10 10 " " '' cord-020871-1v6dcmt3 10 11 models model NNS cord-020871-1v6dcmt3 10 12 remain remain VBP cord-020871-1v6dcmt3 10 13 a a DT cord-020871-1v6dcmt3 10 14 high high JJ cord-020871-1v6dcmt3 10 15 baseline baseline NN cord-020871-1v6dcmt3 10 16 to to TO cord-020871-1v6dcmt3 10 17 beat beat VB cord-020871-1v6dcmt3 10 18 , , , cord-020871-1v6dcmt3 10 19 particularly particularly RB cord-020871-1v6dcmt3 10 20 when when WRB cord-020871-1v6dcmt3 10 21 considering consider VBG cord-020871-1v6dcmt3 10 22 efficiency efficiency NN cord-020871-1v6dcmt3 10 23 in in IN cord-020871-1v6dcmt3 10 24 addition addition NN cord-020871-1v6dcmt3 10 25 to to TO cord-020871-1v6dcmt3 10 26 effectiveness effectiveness VB cord-020871-1v6dcmt3 10 27 [ [ -LRB- cord-020871-1v6dcmt3 10 28 6 6 CD cord-020871-1v6dcmt3 10 29 ] ] -RRB- cord-020871-1v6dcmt3 10 30 . . . 
cord-020871-1v6dcmt3 11 1 Combining combine VBG cord-020871-1v6dcmt3 11 2 the the DT cord-020871-1v6dcmt3 11 3 word word NN cord-020871-1v6dcmt3 11 4 embedding embed VBG cord-020871-1v6dcmt3 11 5 models model NNS cord-020871-1v6dcmt3 11 6 with with IN cord-020871-1v6dcmt3 11 7 the the DT cord-020871-1v6dcmt3 11 8 traditional traditional JJ cord-020871-1v6dcmt3 11 9 IR IR NNP cord-020871-1v6dcmt3 11 10 models model NNS cord-020871-1v6dcmt3 11 11 is be VBZ cord-020871-1v6dcmt3 11 12 therefore therefore RB cord-020871-1v6dcmt3 11 13 very very RB cord-020871-1v6dcmt3 11 14 attractive attractive JJ cord-020871-1v6dcmt3 11 15 and and CC cord-020871-1v6dcmt3 11 16 several several JJ cord-020871-1v6dcmt3 11 17 papers paper NNS cord-020871-1v6dcmt3 11 18 have have VBP cord-020871-1v6dcmt3 11 19 attempted attempt VBN cord-020871-1v6dcmt3 11 20 to to TO cord-020871-1v6dcmt3 11 21 improve improve VB cord-020871-1v6dcmt3 11 22 the the DT cord-020871-1v6dcmt3 11 23 baseline baseline NN cord-020871-1v6dcmt3 11 24 by by IN cord-020871-1v6dcmt3 11 25 adding add VBG cord-020871-1v6dcmt3 11 26 in in RP cord-020871-1v6dcmt3 11 27 , , , cord-020871-1v6dcmt3 11 28 in in IN cord-020871-1v6dcmt3 11 29 a a DT cord-020871-1v6dcmt3 11 30 more more RBR cord-020871-1v6dcmt3 11 31 or or CC cord-020871-1v6dcmt3 11 32 less less JJR cord-020871-1v6dcmt3 11 33 ad ad FW cord-020871-1v6dcmt3 11 34 - - HYPH cord-020871-1v6dcmt3 11 35 hoc hoc FW cord-020871-1v6dcmt3 11 36 fashion fashion NN cord-020871-1v6dcmt3 11 37 , , , cord-020871-1v6dcmt3 11 38 word word NN cord-020871-1v6dcmt3 11 39 - - HYPH cord-020871-1v6dcmt3 11 40 embedding embed VBG cord-020871-1v6dcmt3 11 41 information information NN cord-020871-1v6dcmt3 11 42 . . . cord-020871-1v6dcmt3 12 1 Onal Onal NNP cord-020871-1v6dcmt3 12 2 et et NNP cord-020871-1v6dcmt3 12 3 al al NNP cord-020871-1v6dcmt3 12 4 . . . 
cord-020871-1v6dcmt3 13 1 [ [ -LRB- cord-020871-1v6dcmt3 13 2 10 10 CD cord-020871-1v6dcmt3 13 3 ] ] -RRB- cord-020871-1v6dcmt3 13 4 summarized summarize VBD cord-020871-1v6dcmt3 13 5 the the DT cord-020871-1v6dcmt3 13 6 various various JJ cord-020871-1v6dcmt3 13 7 developments development NNS cord-020871-1v6dcmt3 13 8 of of IN cord-020871-1v6dcmt3 13 9 the the DT cord-020871-1v6dcmt3 13 10 last last JJ cord-020871-1v6dcmt3 13 11 half half JJ cord-020871-1v6dcmt3 13 12 - - HYPH cord-020871-1v6dcmt3 13 13 decade decade NN cord-020871-1v6dcmt3 13 14 in in IN cord-020871-1v6dcmt3 13 15 the the DT cord-020871-1v6dcmt3 13 16 field field NN cord-020871-1v6dcmt3 13 17 of of IN cord-020871-1v6dcmt3 13 18 neural neural JJ cord-020871-1v6dcmt3 13 19 IR IR NNP cord-020871-1v6dcmt3 13 20 and and CC cord-020871-1v6dcmt3 13 21 group group NN cord-020871-1v6dcmt3 13 22 the the DT cord-020871-1v6dcmt3 13 23 methods method NNS cord-020871-1v6dcmt3 13 24 in in IN cord-020871-1v6dcmt3 13 25 two two CD cord-020871-1v6dcmt3 13 26 categories category NNS cord-020871-1v6dcmt3 13 27 : : : cord-020871-1v6dcmt3 13 28 aggregate aggregate JJ cord-020871-1v6dcmt3 13 29 and and CC cord-020871-1v6dcmt3 13 30 learn learn VB cord-020871-1v6dcmt3 13 31 . . . 
cord-020871-1v6dcmt3 14 1 The the DT cord-020871-1v6dcmt3 14 2 first first JJ cord-020871-1v6dcmt3 14 3 one one NN cord-020871-1v6dcmt3 14 4 , , , cord-020871-1v6dcmt3 14 5 also also RB cord-020871-1v6dcmt3 14 6 known know VBN cord-020871-1v6dcmt3 14 7 as as IN cord-020871-1v6dcmt3 14 8 compositional compositional JJ cord-020871-1v6dcmt3 14 9 distributional distributional JJ cord-020871-1v6dcmt3 14 10 semantics semantic NNS cord-020871-1v6dcmt3 14 11 , , , cord-020871-1v6dcmt3 14 12 starts start VBZ cord-020871-1v6dcmt3 14 13 from from IN cord-020871-1v6dcmt3 14 14 term term NN cord-020871-1v6dcmt3 14 15 representations representation NNS cord-020871-1v6dcmt3 14 16 and and CC cord-020871-1v6dcmt3 14 17 uses use VBZ cord-020871-1v6dcmt3 14 18 some some DT cord-020871-1v6dcmt3 14 19 function function NN cord-020871-1v6dcmt3 14 20 to to TO cord-020871-1v6dcmt3 14 21 combine combine VB cord-020871-1v6dcmt3 14 22 them -PRON- PRP cord-020871-1v6dcmt3 14 23 into into IN cord-020871-1v6dcmt3 14 24 a a DT cord-020871-1v6dcmt3 14 25 document document NN cord-020871-1v6dcmt3 14 26 representation representation NN cord-020871-1v6dcmt3 14 27 ( ( -LRB- cord-020871-1v6dcmt3 14 28 a a DT cord-020871-1v6dcmt3 14 29 simple simple JJ cord-020871-1v6dcmt3 14 30 example example NN cord-020871-1v6dcmt3 14 31 is be VBZ cord-020871-1v6dcmt3 14 32 a a DT cord-020871-1v6dcmt3 14 33 weighted weighted JJ cord-020871-1v6dcmt3 14 34 sum sum NN cord-020871-1v6dcmt3 14 35 ) ) -RRB- cord-020871-1v6dcmt3 14 36 . . . 
cord-020871-1v6dcmt3 15 1 The the DT cord-020871-1v6dcmt3 15 2 second second JJ cord-020871-1v6dcmt3 15 3 method method NN cord-020871-1v6dcmt3 15 4 uses use VBZ cord-020871-1v6dcmt3 15 5 the the DT cord-020871-1v6dcmt3 15 6 word word NN cord-020871-1v6dcmt3 15 7 embedding embed VBG cord-020871-1v6dcmt3 15 8 as as IN cord-020871-1v6dcmt3 15 9 a a DT cord-020871-1v6dcmt3 15 10 first first JJ cord-020871-1v6dcmt3 15 11 layer layer NN cord-020871-1v6dcmt3 15 12 of of IN cord-020871-1v6dcmt3 15 13 another another DT cord-020871-1v6dcmt3 15 14 neural neural JJ cord-020871-1v6dcmt3 15 15 network network NN cord-020871-1v6dcmt3 15 16 to to TO cord-020871-1v6dcmt3 15 17 output output VB cord-020871-1v6dcmt3 15 18 a a DT cord-020871-1v6dcmt3 15 19 document document NN cord-020871-1v6dcmt3 15 20 representation representation NN cord-020871-1v6dcmt3 15 21 . . . cord-020871-1v6dcmt3 16 1 The the DT cord-020871-1v6dcmt3 16 2 advantage advantage NN cord-020871-1v6dcmt3 16 3 of of IN cord-020871-1v6dcmt3 16 4 the the DT cord-020871-1v6dcmt3 16 5 first first JJ cord-020871-1v6dcmt3 16 6 type type NN cord-020871-1v6dcmt3 16 7 of of IN cord-020871-1v6dcmt3 16 8 methods method NNS cord-020871-1v6dcmt3 16 9 is be VBZ cord-020871-1v6dcmt3 16 10 that that IN cord-020871-1v6dcmt3 16 11 they -PRON- PRP cord-020871-1v6dcmt3 16 12 often often RB cord-020871-1v6dcmt3 16 13 distill distill VBP cord-020871-1v6dcmt3 16 14 down down RP cord-020871-1v6dcmt3 16 15 to to IN cord-020871-1v6dcmt3 16 16 a a DT cord-020871-1v6dcmt3 16 17 linear linear JJ cord-020871-1v6dcmt3 16 18 combination combination NN cord-020871-1v6dcmt3 16 19 ( ( -LRB- cord-020871-1v6dcmt3 16 20 perhaps perhaps RB cord-020871-1v6dcmt3 16 21 via via IN cord-020871-1v6dcmt3 16 22 a a DT cord-020871-1v6dcmt3 16 23 kernel kernel NN cord-020871-1v6dcmt3 16 24 ) ) -RRB- cord-020871-1v6dcmt3 16 25 , , , cord-020871-1v6dcmt3 16 26 from from IN cord-020871-1v6dcmt3 16 27 which which WDT cord-020871-1v6dcmt3 16 28 an an DT 
cord-020871-1v6dcmt3 16 29 explanation explanation NN cord-020871-1v6dcmt3 16 30 about about IN cord-020871-1v6dcmt3 16 31 the the DT cord-020871-1v6dcmt3 16 32 representation representation NN cord-020871-1v6dcmt3 16 33 of of IN cord-020871-1v6dcmt3 16 34 the the DT cord-020871-1v6dcmt3 16 35 document document NN cord-020871-1v6dcmt3 16 36 is be VBZ cord-020871-1v6dcmt3 16 37 easier easy JJR cord-020871-1v6dcmt3 16 38 to to TO cord-020871-1v6dcmt3 16 39 induce induce VB cord-020871-1v6dcmt3 16 40 than than IN cord-020871-1v6dcmt3 16 41 from from IN cord-020871-1v6dcmt3 16 42 the the DT cord-020871-1v6dcmt3 16 43 neural neural JJ cord-020871-1v6dcmt3 16 44 network network NN cord-020871-1v6dcmt3 16 45 layers layer NNS cord-020871-1v6dcmt3 16 46 built build VBN cord-020871-1v6dcmt3 16 47 on on IN cord-020871-1v6dcmt3 16 48 top top NN cord-020871-1v6dcmt3 16 49 of of IN cord-020871-1v6dcmt3 16 50 a a DT cord-020871-1v6dcmt3 16 51 word word NN cord-020871-1v6dcmt3 16 52 embedding embedding NN cord-020871-1v6dcmt3 16 53 . . . cord-020871-1v6dcmt3 17 1 Recently recently RB cord-020871-1v6dcmt3 17 2 , , , cord-020871-1v6dcmt3 17 3 the the DT cord-020871-1v6dcmt3 17 4 issue issue NN cord-020871-1v6dcmt3 17 5 of of IN cord-020871-1v6dcmt3 17 6 explainability explainability NN cord-020871-1v6dcmt3 17 7 in in IN cord-020871-1v6dcmt3 17 8 IR IR NNP cord-020871-1v6dcmt3 17 9 and and CC cord-020871-1v6dcmt3 17 10 recommendation recommendation NN cord-020871-1v6dcmt3 17 11 is be VBZ cord-020871-1v6dcmt3 17 12 generating generate VBG cord-020871-1v6dcmt3 17 13 a a DT cord-020871-1v6dcmt3 17 14 renewed renew VBN cord-020871-1v6dcmt3 17 15 interest interest NN cord-020871-1v6dcmt3 17 16 [ [ -LRB- cord-020871-1v6dcmt3 17 17 15 15 CD cord-020871-1v6dcmt3 17 18 ] ] -RRB- cord-020871-1v6dcmt3 17 19 . . . 
cord-020871-1v6dcmt3 18 1 In in IN cord-020871-1v6dcmt3 18 2 this this DT cord-020871-1v6dcmt3 18 3 sense sense NN cord-020871-1v6dcmt3 18 4 , , , cord-020871-1v6dcmt3 18 5 Zhang Zhang NNP cord-020871-1v6dcmt3 18 6 et et NNP cord-020871-1v6dcmt3 18 7 al al NNP cord-020871-1v6dcmt3 18 8 . . . cord-020871-1v6dcmt3 19 1 [ [ -LRB- cord-020871-1v6dcmt3 19 2 14 14 CD cord-020871-1v6dcmt3 19 3 ] ] -RRB- cord-020871-1v6dcmt3 19 4 introduced introduce VBD cord-020871-1v6dcmt3 19 5 a a DT cord-020871-1v6dcmt3 19 6 new new JJ cord-020871-1v6dcmt3 19 7 model model NN cord-020871-1v6dcmt3 19 8 for for IN cord-020871-1v6dcmt3 19 9 combining combine VBG cord-020871-1v6dcmt3 19 10 highdimensional highdimensional JJ cord-020871-1v6dcmt3 19 11 vectors vector NNS cord-020871-1v6dcmt3 19 12 , , , cord-020871-1v6dcmt3 19 13 using use VBG cord-020871-1v6dcmt3 19 14 a a DT cord-020871-1v6dcmt3 19 15 mixture mixture NN cord-020871-1v6dcmt3 19 16 model model NN cord-020871-1v6dcmt3 19 17 of of IN cord-020871-1v6dcmt3 19 18 von von NNP cord-020871-1v6dcmt3 19 19 Mises Mises NNP cord-020871-1v6dcmt3 19 20 - - HYPH cord-020871-1v6dcmt3 19 21 Fisher Fisher NNP cord-020871-1v6dcmt3 19 22 ( ( -LRB- cord-020871-1v6dcmt3 19 23 VMF VMF NNP cord-020871-1v6dcmt3 19 24 ) ) -RRB- cord-020871-1v6dcmt3 20 1 instead instead RB cord-020871-1v6dcmt3 20 2 of of IN cord-020871-1v6dcmt3 20 3 Gaussian gaussian JJ cord-020871-1v6dcmt3 20 4 distributions distribution NNS cord-020871-1v6dcmt3 20 5 previously previously RB cord-020871-1v6dcmt3 20 6 suggested suggest VBN cord-020871-1v6dcmt3 20 7 by by IN cord-020871-1v6dcmt3 20 8 Clinchant Clinchant NNP cord-020871-1v6dcmt3 20 9 and and CC cord-020871-1v6dcmt3 20 10 Perronnin Perronnin NNP cord-020871-1v6dcmt3 20 11 [ [ -LRB- cord-020871-1v6dcmt3 20 12 3 3 CD cord-020871-1v6dcmt3 20 13 ] ] -RRB- cord-020871-1v6dcmt3 20 14 . . . 
cord-020871-1v6dcmt3 21 1 This this DT cord-020871-1v6dcmt3 21 2 is be VBZ cord-020871-1v6dcmt3 21 3 an an DT cord-020871-1v6dcmt3 21 4 attractive attractive JJ cord-020871-1v6dcmt3 21 5 hypothesis hypothesis NN cord-020871-1v6dcmt3 21 6 because because IN cord-020871-1v6dcmt3 21 7 the the DT cord-020871-1v6dcmt3 21 8 Gaussian Gaussian NNP cord-020871-1v6dcmt3 21 9 Mixture Mixture NNP cord-020871-1v6dcmt3 21 10 Model Model NNP cord-020871-1v6dcmt3 21 11 ( ( -LRB- cord-020871-1v6dcmt3 21 12 GMM GMM NNP cord-020871-1v6dcmt3 21 13 ) ) -RRB- cord-020871-1v6dcmt3 21 14 works work VBZ cord-020871-1v6dcmt3 21 15 on on IN cord-020871-1v6dcmt3 21 16 Euclidean euclidean JJ cord-020871-1v6dcmt3 21 17 distance distance NN cord-020871-1v6dcmt3 21 18 , , , cord-020871-1v6dcmt3 21 19 while while IN cord-020871-1v6dcmt3 21 20 the the DT cord-020871-1v6dcmt3 21 21 mixture mixture NN cord-020871-1v6dcmt3 21 22 of of IN cord-020871-1v6dcmt3 21 23 von von NNP cord-020871-1v6dcmt3 21 24 Mises Mises NNP cord-020871-1v6dcmt3 21 25 - - HYPH cord-020871-1v6dcmt3 21 26 Fisher Fisher NNP cord-020871-1v6dcmt3 21 27 ( ( -LRB- cord-020871-1v6dcmt3 21 28 moVMF movmf NN cord-020871-1v6dcmt3 21 29 ) ) -RRB- cord-020871-1v6dcmt3 21 30 model model NN cord-020871-1v6dcmt3 21 31 works work VBZ cord-020871-1v6dcmt3 21 32 on on IN cord-020871-1v6dcmt3 21 33 cosine cosine NN cord-020871-1v6dcmt3 21 34 distances distance NNS cord-020871-1v6dcmt3 21 35 - - : cord-020871-1v6dcmt3 21 36 the the DT cord-020871-1v6dcmt3 21 37 typical typical JJ cord-020871-1v6dcmt3 21 38 distance distance NN cord-020871-1v6dcmt3 21 39 function function NN cord-020871-1v6dcmt3 21 40 in in IN cord-020871-1v6dcmt3 21 41 IR IR NNP cord-020871-1v6dcmt3 21 42 . . . 
cord-020871-1v6dcmt3 22 1 In in IN cord-020871-1v6dcmt3 22 2 the the DT cord-020871-1v6dcmt3 22 3 following following JJ cord-020871-1v6dcmt3 22 4 sections section NNS cord-020871-1v6dcmt3 22 5 , , , cord-020871-1v6dcmt3 22 6 we -PRON- PRP cord-020871-1v6dcmt3 22 7 set set VBD cord-020871-1v6dcmt3 22 8 up up RP cord-020871-1v6dcmt3 22 9 to to TO cord-020871-1v6dcmt3 22 10 replicate replicate VB cord-020871-1v6dcmt3 22 11 the the DT cord-020871-1v6dcmt3 22 12 experiments experiment NNS cord-020871-1v6dcmt3 22 13 described describe VBN cord-020871-1v6dcmt3 22 14 by by IN cord-020871-1v6dcmt3 22 15 Zhang Zhang NNP cord-020871-1v6dcmt3 22 16 et et NNP cord-020871-1v6dcmt3 22 17 al al NNP cord-020871-1v6dcmt3 22 18 . . . cord-020871-1v6dcmt3 23 1 [ [ -LRB- cord-020871-1v6dcmt3 23 2 14 14 CD cord-020871-1v6dcmt3 23 3 ] ] -RRB- cord-020871-1v6dcmt3 23 4 . . . cord-020871-1v6dcmt3 24 1 They -PRON- PRP cord-020871-1v6dcmt3 24 2 are be VBP cord-020871-1v6dcmt3 24 3 grouped group VBN cord-020871-1v6dcmt3 24 4 in in IN cord-020871-1v6dcmt3 24 5 three three CD cord-020871-1v6dcmt3 24 6 sets set NNS cord-020871-1v6dcmt3 24 7 : : : cord-020871-1v6dcmt3 24 8 classification classification NN cord-020871-1v6dcmt3 24 9 , , , cord-020871-1v6dcmt3 24 10 clustering clustering NN cord-020871-1v6dcmt3 24 11 , , , cord-020871-1v6dcmt3 24 12 and and CC cord-020871-1v6dcmt3 24 13 information information NN cord-020871-1v6dcmt3 24 14 retrieval retrieval NN cord-020871-1v6dcmt3 24 15 , , , cord-020871-1v6dcmt3 24 16 and and CC cord-020871-1v6dcmt3 24 17 compare compare VB cord-020871-1v6dcmt3 24 18 " " `` cord-020871-1v6dcmt3 24 19 standard standard JJ cord-020871-1v6dcmt3 24 20 " " '' cord-020871-1v6dcmt3 24 21 embedding embed VBG cord-020871-1v6dcmt3 24 22 methods method NNS cord-020871-1v6dcmt3 24 23 with with IN cord-020871-1v6dcmt3 24 24 the the DT cord-020871-1v6dcmt3 24 25 novel novel NN cord-020871-1v6dcmt3 24 26 moVMF movmf NN cord-020871-1v6dcmt3 24 27 representation representation 
NN cord-020871-1v6dcmt3 24 28 . . . cord-020871-1v6dcmt3 25 1 In in IN cord-020871-1v6dcmt3 25 2 general general JJ cord-020871-1v6dcmt3 25 3 , , , cord-020871-1v6dcmt3 25 4 we -PRON- PRP cord-020871-1v6dcmt3 25 5 follow follow VBP cord-020871-1v6dcmt3 25 6 the the DT cord-020871-1v6dcmt3 25 7 experimental experimental JJ cord-020871-1v6dcmt3 25 8 setup setup NN cord-020871-1v6dcmt3 25 9 of of IN cord-020871-1v6dcmt3 25 10 the the DT cord-020871-1v6dcmt3 25 11 original original JJ cord-020871-1v6dcmt3 25 12 paper paper NN cord-020871-1v6dcmt3 25 13 and and CC cord-020871-1v6dcmt3 25 14 , , , cord-020871-1v6dcmt3 25 15 for for IN cord-020871-1v6dcmt3 25 16 lack lack NN cord-020871-1v6dcmt3 25 17 of of IN cord-020871-1v6dcmt3 25 18 space space NN cord-020871-1v6dcmt3 25 19 , , , cord-020871-1v6dcmt3 25 20 we -PRON- PRP cord-020871-1v6dcmt3 25 21 do do VBP cord-020871-1v6dcmt3 25 22 not not RB cord-020871-1v6dcmt3 25 23 repeat repeat VB cord-020871-1v6dcmt3 25 24 here here RB cord-020871-1v6dcmt3 25 25 many many JJ cord-020871-1v6dcmt3 25 26 details detail NNS cord-020871-1v6dcmt3 25 27 , , , cord-020871-1v6dcmt3 25 28 if if IN cord-020871-1v6dcmt3 25 29 they -PRON- PRP cord-020871-1v6dcmt3 25 30 are be VBP cord-020871-1v6dcmt3 25 31 clearly clearly RB cord-020871-1v6dcmt3 25 32 explained explain VBN cord-020871-1v6dcmt3 25 33 there there RB cord-020871-1v6dcmt3 25 34 . . . 
cord-020871-1v6dcmt3 26 1 All all DT cord-020871-1v6dcmt3 26 2 experiments experiment NNS cord-020871-1v6dcmt3 26 3 are be VBP cord-020871-1v6dcmt3 26 4 conducted conduct VBN cord-020871-1v6dcmt3 26 5 on on IN cord-020871-1v6dcmt3 26 6 publicly publicly RB cord-020871-1v6dcmt3 26 7 available available JJ cord-020871-1v6dcmt3 26 8 datasets dataset NNS cord-020871-1v6dcmt3 26 9 and and CC cord-020871-1v6dcmt3 26 10 are be VBP cord-020871-1v6dcmt3 26 11 briefly briefly RB cord-020871-1v6dcmt3 26 12 described describe VBN cord-020871-1v6dcmt3 26 13 here here RB cord-020871-1v6dcmt3 26 14 below below RB cord-020871-1v6dcmt3 26 15 . . . cord-020871-1v6dcmt3 27 1 Classification classification NN cord-020871-1v6dcmt3 27 2 . . . cord-020871-1v6dcmt3 28 1 Two two CD cord-020871-1v6dcmt3 28 2 subsets subset NNS cord-020871-1v6dcmt3 28 3 of of IN cord-020871-1v6dcmt3 28 4 the the DT cord-020871-1v6dcmt3 28 5 movie movie NN cord-020871-1v6dcmt3 28 6 review review NN cord-020871-1v6dcmt3 28 7 dataset dataset NN cord-020871-1v6dcmt3 28 8 : : : cord-020871-1v6dcmt3 28 9 ( ( -LRB- cord-020871-1v6dcmt3 28 10 i i NN cord-020871-1v6dcmt3 28 11 ) ) -RRB- cord-020871-1v6dcmt3 28 12 the the DT cord-020871-1v6dcmt3 28 13 subjectivity subjectivity NN cord-020871-1v6dcmt3 28 14 dataset dataset NN cord-020871-1v6dcmt3 28 15 ( ( -LRB- cord-020871-1v6dcmt3 28 16 subj subj NNP cord-020871-1v6dcmt3 28 17 ) ) -RRB- cord-020871-1v6dcmt3 28 18 [ [ -LRB- cord-020871-1v6dcmt3 28 19 11 11 CD cord-020871-1v6dcmt3 28 20 ] ] -RRB- cord-020871-1v6dcmt3 28 21 ; ; : cord-020871-1v6dcmt3 28 22 and and CC cord-020871-1v6dcmt3 28 23 ( ( -LRB- cord-020871-1v6dcmt3 28 24 ii ii NN cord-020871-1v6dcmt3 28 25 ) ) -RRB- cord-020871-1v6dcmt3 28 26 the the DT cord-020871-1v6dcmt3 28 27 sentence sentence NN cord-020871-1v6dcmt3 28 28 polarity polarity NN cord-020871-1v6dcmt3 28 29 dataset dataset NN cord-020871-1v6dcmt3 28 30 ( ( -LRB- cord-020871-1v6dcmt3 28 31 sent send VBN cord-020871-1v6dcmt3 28 32 ) ) -RRB- 
cord-020871-1v6dcmt3 28 33 [ [ -LRB- cord-020871-1v6dcmt3 28 34 12 12 CD cord-020871-1v6dcmt3 28 35 ] ] -RRB- cord-020871-1v6dcmt3 28 36 . . . cord-020871-1v6dcmt3 29 1 Clustering cluster VBG cord-020871-1v6dcmt3 29 2 . . . cord-020871-1v6dcmt3 30 1 The the DT cord-020871-1v6dcmt3 30 2 20 20 CD cord-020871-1v6dcmt3 30 3 Newsgroups newsgroup NNS cord-020871-1v6dcmt3 30 4 dataset dataset NN cord-020871-1v6dcmt3 30 5 1 1 CD cord-020871-1v6dcmt3 30 6 was be VBD cord-020871-1v6dcmt3 30 7 used use VBN cord-020871-1v6dcmt3 30 8 in in IN cord-020871-1v6dcmt3 30 9 the the DT cord-020871-1v6dcmt3 30 10 original original JJ cord-020871-1v6dcmt3 30 11 paper paper NN cord-020871-1v6dcmt3 30 12 , , , cord-020871-1v6dcmt3 30 13 but but CC cord-020871-1v6dcmt3 30 14 the the DT cord-020871-1v6dcmt3 30 15 concrete concrete JJ cord-020871-1v6dcmt3 30 16 version version NN cord-020871-1v6dcmt3 30 17 was be VBD cord-020871-1v6dcmt3 30 18 not not RB cord-020871-1v6dcmt3 30 19 specified specify VBN cord-020871-1v6dcmt3 30 20 . . . 
cord-020871-1v6dcmt3 31 1 We -PRON- PRP cord-020871-1v6dcmt3 31 2 selected select VBD cord-020871-1v6dcmt3 31 3 the the DT cord-020871-1v6dcmt3 31 4 " " `` cord-020871-1v6dcmt3 31 5 bydate bydate NNP cord-020871-1v6dcmt3 31 6 " " '' cord-020871-1v6dcmt3 31 7 version version NN cord-020871-1v6dcmt3 31 8 , , , cord-020871-1v6dcmt3 31 9 because because IN cord-020871-1v6dcmt3 31 10 it -PRON- PRP cord-020871-1v6dcmt3 31 11 is be VBZ cord-020871-1v6dcmt3 31 12 , , , cord-020871-1v6dcmt3 31 13 according accord VBG cord-020871-1v6dcmt3 31 14 to to IN cord-020871-1v6dcmt3 31 15 its -PRON- PRP$ cord-020871-1v6dcmt3 31 16 creators creator NNS cord-020871-1v6dcmt3 31 17 , , , cord-020871-1v6dcmt3 31 18 the the DT cord-020871-1v6dcmt3 31 19 most most RBS cord-020871-1v6dcmt3 31 20 commonly commonly RB cord-020871-1v6dcmt3 31 21 used use VBN cord-020871-1v6dcmt3 31 22 in in IN cord-020871-1v6dcmt3 31 23 the the DT cord-020871-1v6dcmt3 31 24 literature literature NN cord-020871-1v6dcmt3 31 25 . . . cord-020871-1v6dcmt3 32 1 It -PRON- PRP cord-020871-1v6dcmt3 32 2 is be VBZ cord-020871-1v6dcmt3 32 3 also also RB cord-020871-1v6dcmt3 32 4 the the DT cord-020871-1v6dcmt3 32 5 version version NN cord-020871-1v6dcmt3 32 6 directly directly RB cord-020871-1v6dcmt3 32 7 load load NN cord-020871-1v6dcmt3 32 8 - - HYPH cord-020871-1v6dcmt3 32 9 able able JJ cord-020871-1v6dcmt3 32 10 in in IN cord-020871-1v6dcmt3 32 11 scikit scikit NNS cord-020871-1v6dcmt3 32 12 - - : cord-020871-1v6dcmt3 32 13 learn learn VB cord-020871-1v6dcmt3 32 14 2 2 CD cord-020871-1v6dcmt3 32 15 , , , cord-020871-1v6dcmt3 32 16 making make VBG cord-020871-1v6dcmt3 32 17 it -PRON- PRP cord-020871-1v6dcmt3 32 18 therefore therefore RB cord-020871-1v6dcmt3 32 19 more more RBR cord-020871-1v6dcmt3 32 20 likely likely JJ cord-020871-1v6dcmt3 32 21 that that IN cord-020871-1v6dcmt3 32 22 the the DT cord-020871-1v6dcmt3 32 23 authors author NNS cord-020871-1v6dcmt3 32 24 had have VBD cord-020871-1v6dcmt3 32 25 used use 
VBN cord-020871-1v6dcmt3 32 26 this this DT cord-020871-1v6dcmt3 32 27 version version NN cord-020871-1v6dcmt3 32 28 . . . cord-020871-1v6dcmt3 33 1 Retrieval Retrieval NNP cord-020871-1v6dcmt3 33 2 . . . cord-020871-1v6dcmt3 34 1 The the DT cord-020871-1v6dcmt3 34 2 TREC TREC NNP cord-020871-1v6dcmt3 34 3 Robust04 Robust04 NNP cord-020871-1v6dcmt3 34 4 collection collection NN cord-020871-1v6dcmt3 34 5 [ [ -LRB- cord-020871-1v6dcmt3 34 6 13 13 CD cord-020871-1v6dcmt3 34 7 ] ] -RRB- cord-020871-1v6dcmt3 34 8 . . . cord-020871-1v6dcmt3 35 1 The the DT cord-020871-1v6dcmt3 35 2 methods method NNS cord-020871-1v6dcmt3 35 3 used use VBN cord-020871-1v6dcmt3 35 4 to to TO cord-020871-1v6dcmt3 35 5 generate generate VB cord-020871-1v6dcmt3 35 6 vectors vector NNS cord-020871-1v6dcmt3 35 7 for for IN cord-020871-1v6dcmt3 35 8 terms term NNS cord-020871-1v6dcmt3 35 9 and and CC cord-020871-1v6dcmt3 35 10 documents document NNS cord-020871-1v6dcmt3 35 11 are be VBP cord-020871-1v6dcmt3 35 12 : : : cord-020871-1v6dcmt3 35 13 TF TF NNP cord-020871-1v6dcmt3 35 14 - - HYPH cord-020871-1v6dcmt3 35 15 IDF IDF NNP cord-020871-1v6dcmt3 35 16 . . . cord-020871-1v6dcmt3 36 1 The the DT cord-020871-1v6dcmt3 36 2 basic basic JJ cord-020871-1v6dcmt3 36 3 term term NN cord-020871-1v6dcmt3 36 4 frequency frequency NN cord-020871-1v6dcmt3 36 5 -inverse -inverse HYPH cord-020871-1v6dcmt3 36 6 document document NN cord-020871-1v6dcmt3 36 7 frequency frequency NN cord-020871-1v6dcmt3 36 8 method method NN cord-020871-1v6dcmt3 36 9 [ [ -LRB- cord-020871-1v6dcmt3 36 10 5 5 CD cord-020871-1v6dcmt3 36 11 ] ] -RRB- cord-020871-1v6dcmt3 36 12 . . . cord-020871-1v6dcmt3 37 1 Implemented implement VBN cord-020871-1v6dcmt3 37 2 in in IN cord-020871-1v6dcmt3 37 3 the the DT cord-020871-1v6dcmt3 37 4 scikit scikit NNS cord-020871-1v6dcmt3 37 5 - - HYPH cord-020871-1v6dcmt3 37 6 learn learn VB cord-020871-1v6dcmt3 37 7 library library NN cord-020871-1v6dcmt3 37 8 3 3 CD cord-020871-1v6dcmt3 37 9 . . . 
cord-020871-1v6dcmt3 38 1 [ [ -LRB- cord-020871-1v6dcmt3 38 2 4 4 CD cord-020871-1v6dcmt3 38 3 ] ] -RRB- cord-020871-1v6dcmt3 38 4 . . . cord-020871-1v6dcmt3 39 1 LDA LDA NNP cord-020871-1v6dcmt3 39 2 . . . cord-020871-1v6dcmt3 40 1 Latent Latent NNP cord-020871-1v6dcmt3 40 2 Dirichlet Dirichlet NNP cord-020871-1v6dcmt3 40 3 Allocation Allocation NNP cord-020871-1v6dcmt3 40 4 [ [ -LRB- cord-020871-1v6dcmt3 40 5 2 2 CD cord-020871-1v6dcmt3 40 6 ] ] -RRB- cord-020871-1v6dcmt3 40 7 . . . cord-020871-1v6dcmt3 40 8 cBoW. cbow. ADD cord-020871-1v6dcmt3 40 9 Word2vec word2vec ADD cord-020871-1v6dcmt3 41 1 [ [ -LRB- cord-020871-1v6dcmt3 41 2 9 9 CD cord-020871-1v6dcmt3 41 3 ] ] -RRB- cord-020871-1v6dcmt3 41 4 in in IN cord-020871-1v6dcmt3 41 5 the the DT cord-020871-1v6dcmt3 41 6 Continuous Continuous NNP cord-020871-1v6dcmt3 41 7 Bag Bag NNP cord-020871-1v6dcmt3 41 8 - - HYPH cord-020871-1v6dcmt3 41 9 of of IN cord-020871-1v6dcmt3 41 10 - - HYPH cord-020871-1v6dcmt3 41 11 Word word NN cord-020871-1v6dcmt3 41 12 ( ( -LRB- cord-020871-1v6dcmt3 41 13 cBow cBow NNP cord-020871-1v6dcmt3 41 14 ) ) -RRB- cord-020871-1v6dcmt3 41 15 architecture architecture NN cord-020871-1v6dcmt3 41 16 . . . cord-020871-1v6dcmt3 42 1 PV pv NN cord-020871-1v6dcmt3 42 2 - - HYPH cord-020871-1v6dcmt3 42 3 DBOW DBOW NNP cord-020871-1v6dcmt3 42 4 / / SYM cord-020871-1v6dcmt3 42 5 DM dm NN cord-020871-1v6dcmt3 42 6 . . . 
cord-020871-1v6dcmt3 43 1 Paragraph paragraph NN cord-020871-1v6dcmt3 43 2 vector vector NN cord-020871-1v6dcmt3 43 3 ( ( -LRB- cord-020871-1v6dcmt3 43 4 PV pv NN cord-020871-1v6dcmt3 43 5 ) ) -RRB- cord-020871-1v6dcmt3 43 6 is be VBZ cord-020871-1v6dcmt3 43 7 a a DT cord-020871-1v6dcmt3 43 8 document document NN cord-020871-1v6dcmt3 43 9 embedding embed VBG cord-020871-1v6dcmt3 43 10 algorithm algorithm NN cord-020871-1v6dcmt3 43 11 that that WDT cord-020871-1v6dcmt3 43 12 builds build VBZ cord-020871-1v6dcmt3 43 13 on on IN cord-020871-1v6dcmt3 43 14 Word2vec Word2vec NNP cord-020871-1v6dcmt3 43 15 . . . cord-020871-1v6dcmt3 44 1 We -PRON- PRP cord-020871-1v6dcmt3 44 2 use use VBP cord-020871-1v6dcmt3 44 3 here here RB cord-020871-1v6dcmt3 44 4 both both CC cord-020871-1v6dcmt3 44 5 its -PRON- PRP$ cord-020871-1v6dcmt3 44 6 implementations implementation NNS cord-020871-1v6dcmt3 44 7 : : : cord-020871-1v6dcmt3 44 8 Distributed distribute VBN cord-020871-1v6dcmt3 44 9 Bag Bag NNP cord-020871-1v6dcmt3 44 10 - - HYPH cord-020871-1v6dcmt3 44 11 of of IN cord-020871-1v6dcmt3 44 12 - - HYPH cord-020871-1v6dcmt3 44 13 Words word NNS cord-020871-1v6dcmt3 44 14 ( ( -LRB- cord-020871-1v6dcmt3 44 15 PV PV NNP cord-020871-1v6dcmt3 44 16 - - HYPH cord-020871-1v6dcmt3 44 17 DBOW DBOW NNP cord-020871-1v6dcmt3 44 18 ) ) -RRB- cord-020871-1v6dcmt3 44 19 and and CC cord-020871-1v6dcmt3 44 20 Distributed Distributed NNP cord-020871-1v6dcmt3 44 21 Memory Memory NNP cord-020871-1v6dcmt3 44 22 ( ( -LRB- cord-020871-1v6dcmt3 44 23 PV pv NN cord-020871-1v6dcmt3 44 24 - - HYPH cord-020871-1v6dcmt3 44 25 DM DM NNP cord-020871-1v6dcmt3 44 26 ) ) -RRB- cord-020871-1v6dcmt3 44 27 [ [ -LRB- cord-020871-1v6dcmt3 44 28 7 7 CD cord-020871-1v6dcmt3 44 29 ] ] -RRB- cord-020871-1v6dcmt3 44 30 . . . 
cord-020871-1v6dcmt3 45 1 The the DT cord-020871-1v6dcmt3 45 2 LSI LSI NNP cord-020871-1v6dcmt3 45 3 , , , cord-020871-1v6dcmt3 45 4 LDA LDA NNP cord-020871-1v6dcmt3 45 5 , , , cord-020871-1v6dcmt3 45 6 cBoW cBoW NNP cord-020871-1v6dcmt3 45 7 , , , cord-020871-1v6dcmt3 45 8 and and CC cord-020871-1v6dcmt3 45 9 PV pv NN cord-020871-1v6dcmt3 45 10 implementations implementation NNS cord-020871-1v6dcmt3 45 11 are be VBP cord-020871-1v6dcmt3 45 12 available available JJ cord-020871-1v6dcmt3 45 13 in in IN cord-020871-1v6dcmt3 45 14 the the DT cord-020871-1v6dcmt3 45 15 gensim gensim NNP cord-020871-1v6dcmt3 45 16 library library NNP cord-020871-1v6dcmt3 45 17 4 4 CD cord-020871-1v6dcmt3 45 18 . . . cord-020871-1v6dcmt3 46 1 The the DT cord-020871-1v6dcmt3 46 2 FK FK NNP cord-020871-1v6dcmt3 46 3 framework framework NN cord-020871-1v6dcmt3 46 4 offers offer VBZ cord-020871-1v6dcmt3 46 5 the the DT cord-020871-1v6dcmt3 46 6 option option NN cord-020871-1v6dcmt3 46 7 to to TO cord-020871-1v6dcmt3 46 8 aggregate aggregate VB cord-020871-1v6dcmt3 46 9 word word NN cord-020871-1v6dcmt3 46 10 embeddings embedding NNS cord-020871-1v6dcmt3 46 11 to to TO cord-020871-1v6dcmt3 46 12 obtain obtain VB cord-020871-1v6dcmt3 46 13 fixed fix VBN cord-020871-1v6dcmt3 46 14 - - HYPH cord-020871-1v6dcmt3 46 15 length length NN cord-020871-1v6dcmt3 46 16 representations representation NNS cord-020871-1v6dcmt3 46 17 of of IN cord-020871-1v6dcmt3 46 18 documents document NNS cord-020871-1v6dcmt3 46 19 . . . 
cord-020871-1v6dcmt3 47 1 We -PRON- PRP cord-020871-1v6dcmt3 47 2 use use VBP cord-020871-1v6dcmt3 47 3 Fisher Fisher NNP cord-020871-1v6dcmt3 47 4 vectors vector NNS cord-020871-1v6dcmt3 47 5 ( ( -LRB- cord-020871-1v6dcmt3 47 6 FV FV NNP cord-020871-1v6dcmt3 47 7 ) ) -RRB- cord-020871-1v6dcmt3 47 8 based base VBN cord-020871-1v6dcmt3 47 9 on on IN cord-020871-1v6dcmt3 47 10 ( ( -LRB- cord-020871-1v6dcmt3 47 11 i i NN cord-020871-1v6dcmt3 47 12 ) ) -RRB- cord-020871-1v6dcmt3 47 13 a a DT cord-020871-1v6dcmt3 47 14 Gaussian gaussian JJ cord-020871-1v6dcmt3 47 15 mixture mixture NN cord-020871-1v6dcmt3 47 16 model model NN cord-020871-1v6dcmt3 47 17 ( ( -LRB- cord-020871-1v6dcmt3 47 18 FV FV NNP cord-020871-1v6dcmt3 47 19 - - HYPH cord-020871-1v6dcmt3 47 20 GMM GMM NNP cord-020871-1v6dcmt3 47 21 ) ) -RRB- cord-020871-1v6dcmt3 47 22 and and CC cord-020871-1v6dcmt3 47 23 ( ( -LRB- cord-020871-1v6dcmt3 47 24 ii ii NN cord-020871-1v6dcmt3 47 25 ) ) -RRB- cord-020871-1v6dcmt3 47 26 a a DT cord-020871-1v6dcmt3 47 27 mixture mixture NN cord-020871-1v6dcmt3 47 28 of of IN cord-020871-1v6dcmt3 47 29 von von NNP cord-020871-1v6dcmt3 47 30 Mises Mises NNP cord-020871-1v6dcmt3 47 31 - - HYPH cord-020871-1v6dcmt3 47 32 Fisher Fisher NNP cord-020871-1v6dcmt3 47 33 distributions distribution NNS cord-020871-1v6dcmt3 47 34 ( ( -LRB- cord-020871-1v6dcmt3 47 35 FV fv NN cord-020871-1v6dcmt3 47 36 - - HYPH cord-020871-1v6dcmt3 47 37 moVMF movmf NN cord-020871-1v6dcmt3 47 38 ) ) -RRB- cord-020871-1v6dcmt3 47 39 [ [ -LRB- cord-020871-1v6dcmt3 47 40 1 1 CD cord-020871-1v6dcmt3 47 41 ] ] -RRB- cord-020871-1v6dcmt3 47 42 . . . 
cord-020871-1v6dcmt3 48 1 We -PRON- PRP cord-020871-1v6dcmt3 48 2 first first RB cord-020871-1v6dcmt3 48 3 fit fit VBP cord-020871-1v6dcmt3 48 4 ( ( -LRB- cord-020871-1v6dcmt3 48 5 i i NN cord-020871-1v6dcmt3 48 6 ) ) -RRB- cord-020871-1v6dcmt3 49 1 a a DT cord-020871-1v6dcmt3 49 2 GMM GMM NNP cord-020871-1v6dcmt3 49 3 and and CC cord-020871-1v6dcmt3 49 4 ( ( -LRB- cord-020871-1v6dcmt3 49 5 ii ii NN cord-020871-1v6dcmt3 49 6 ) ) -RRB- cord-020871-1v6dcmt3 49 7 a a DT cord-020871-1v6dcmt3 49 8 moVMF movmf NN cord-020871-1v6dcmt3 49 9 model model NN cord-020871-1v6dcmt3 49 10 on on IN cord-020871-1v6dcmt3 49 11 previously previously RB cord-020871-1v6dcmt3 49 12 learnt learn VBN cord-020871-1v6dcmt3 49 13 continuous continuous JJ cord-020871-1v6dcmt3 49 14 word word NN cord-020871-1v6dcmt3 49 15 embeddings embedding NNS cord-020871-1v6dcmt3 49 16 . . . cord-020871-1v6dcmt3 50 1 The the DT cord-020871-1v6dcmt3 50 2 fixed fix VBN cord-020871-1v6dcmt3 50 3 - - HYPH cord-020871-1v6dcmt3 50 4 length length NN cord-020871-1v6dcmt3 50 5 representation representation NN cord-020871-1v6dcmt3 50 6 of of IN cord-020871-1v6dcmt3 50 7 a a DT cord-020871-1v6dcmt3 50 8 document document NN cord-020871-1v6dcmt3 50 9 X x NN cord-020871-1v6dcmt3 50 10 containing contain VBG cord-020871-1v6dcmt3 50 11 T T NNP cord-020871-1v6dcmt3 50 12 words word NNS cord-020871-1v6dcmt3 51 1 w w NNP cord-020871-1v6dcmt3 51 2 i i PRP cord-020871-1v6dcmt3 52 1 -expressed -expressed JJ cord-020871-1v6dcmt3 53 1 as as IN cord-020871-1v6dcmt3 53 2 where where WRB cord-020871-1v6dcmt3 53 3 K K NNP cord-020871-1v6dcmt3 53 4 is be VBZ cord-020871-1v6dcmt3 53 5 the the DT cord-020871-1v6dcmt3 53 6 number number NN cord-020871-1v6dcmt3 53 7 of of IN cord-020871-1v6dcmt3 53 8 mixture mixture NN cord-020871-1v6dcmt3 53 9 components component NNS cord-020871-1v6dcmt3 53 10 . . . 
cord-020871-1v6dcmt3 54 1 The the DT cord-020871-1v6dcmt3 54 2 vectors vector NNS cord-020871-1v6dcmt3 54 3 G g NN cord-020871-1v6dcmt3 54 4 X x NN cord-020871-1v6dcmt3 54 5 i i NN cord-020871-1v6dcmt3 54 6 , , , cord-020871-1v6dcmt3 54 7 having have VBG cord-020871-1v6dcmt3 54 8 the the DT cord-020871-1v6dcmt3 54 9 dimension dimension NN cord-020871-1v6dcmt3 54 10 ( ( -LRB- cord-020871-1v6dcmt3 54 11 d d NN cord-020871-1v6dcmt3 54 12 ) ) -RRB- cord-020871-1v6dcmt3 54 13 of of IN cord-020871-1v6dcmt3 54 14 the the DT cord-020871-1v6dcmt3 54 15 word word NN cord-020871-1v6dcmt3 54 16 vectors vector NNS cord-020871-1v6dcmt3 54 17 E E NNP cord-020871-1v6dcmt3 54 18 wi wi NNP cord-020871-1v6dcmt3 54 19 , , , cord-020871-1v6dcmt3 54 20 are be VBP cord-020871-1v6dcmt3 54 21 explicitly explicitly RB cord-020871-1v6dcmt3 54 22 given give VBN cord-020871-1v6dcmt3 54 23 by by IN cord-020871-1v6dcmt3 54 24 [ [ -LRB- cord-020871-1v6dcmt3 54 25 3 3 CD cord-020871-1v6dcmt3 54 26 , , , cord-020871-1v6dcmt3 54 27 14 14 CD cord-020871-1v6dcmt3 54 28 ] ] -RRB- cord-020871-1v6dcmt3 54 29 : : : cord-020871-1v6dcmt3 54 30 where where WRB cord-020871-1v6dcmt3 54 31 ω ω NN cord-020871-1v6dcmt3 54 32 i i PRP cord-020871-1v6dcmt3 54 33 are be VBP cord-020871-1v6dcmt3 54 34 the the DT cord-020871-1v6dcmt3 54 35 mixture mixture NN cord-020871-1v6dcmt3 54 36 weights weight NNS cord-020871-1v6dcmt3 54 37 , , , cord-020871-1v6dcmt3 54 38 γ γ NNP cord-020871-1v6dcmt3 54 39 t t NN cord-020871-1v6dcmt3 54 40 ( ( -LRB- cord-020871-1v6dcmt3 54 41 i i NN cord-020871-1v6dcmt3 54 42 ) ) -RRB- cord-020871-1v6dcmt3 54 43 = = SYM cord-020871-1v6dcmt3 54 44 p(i|x p(i|x NNP cord-020871-1v6dcmt3 54 45 t t NN cord-020871-1v6dcmt3 54 46 ) ) -RRB- cord-020871-1v6dcmt3 54 47 is be VBZ cord-020871-1v6dcmt3 54 48 the the DT cord-020871-1v6dcmt3 54 49 soft soft JJ cord-020871-1v6dcmt3 54 50 assignment assignment NN cord-020871-1v6dcmt3 54 51 of of IN cord-020871-1v6dcmt3 54 52 x x SYM cord-020871-1v6dcmt3 54 53 t t 
NN cord-020871-1v6dcmt3 55 1 to to IN cord-020871-1v6dcmt3 55 2 ( ( -LRB- cord-020871-1v6dcmt3 55 3 i i NN cord-020871-1v6dcmt3 55 4 ) ) -RRB- cord-020871-1v6dcmt3 55 5 Gaussian Gaussian NNP cord-020871-1v6dcmt3 55 6 and and CC cord-020871-1v6dcmt3 55 7 ( ( -LRB- cord-020871-1v6dcmt3 55 8 ii ii NN cord-020871-1v6dcmt3 55 9 ) ) -RRB- cord-020871-1v6dcmt3 56 1 VMF VMF NNP cord-020871-1v6dcmt3 56 2 distribution distribution NN cord-020871-1v6dcmt3 56 3 i i PRP cord-020871-1v6dcmt3 56 4 , , , cord-020871-1v6dcmt3 56 5 and and CC cord-020871-1v6dcmt3 56 6 σ σ NNP cord-020871-1v6dcmt3 56 7 2 2 CD cord-020871-1v6dcmt3 56 8 i i NNP cord-020871-1v6dcmt3 56 9 = = SYM cord-020871-1v6dcmt3 56 10 diag(Σ diag(Σ NNS cord-020871-1v6dcmt3 56 11 i i PRP cord-020871-1v6dcmt3 56 12 ) ) -RRB- cord-020871-1v6dcmt3 56 13 , , , cord-020871-1v6dcmt3 56 14 with with IN cord-020871-1v6dcmt3 56 15 Σ Σ NNP cord-020871-1v6dcmt3 56 16 i i PRP cord-020871-1v6dcmt3 56 17 the the DT cord-020871-1v6dcmt3 56 18 covariance covariance NN cord-020871-1v6dcmt3 56 19 matrix matrix NN cord-020871-1v6dcmt3 56 20 of of IN cord-020871-1v6dcmt3 56 21 Gaussian Gaussian NNP cord-020871-1v6dcmt3 56 22 i. i. 
NN cord-020871-1v6dcmt3 57 1 In in IN cord-020871-1v6dcmt3 57 2 ( ( -LRB- cord-020871-1v6dcmt3 57 3 i i NN cord-020871-1v6dcmt3 57 4 ) ) -RRB- cord-020871-1v6dcmt3 57 5 , , , cord-020871-1v6dcmt3 57 6 σ σ NNP cord-020871-1v6dcmt3 58 1 i i PRP cord-020871-1v6dcmt3 58 2 refers refer VBZ cord-020871-1v6dcmt3 58 3 to to IN cord-020871-1v6dcmt3 58 4 the the DT cord-020871-1v6dcmt3 58 5 mean mean JJ cord-020871-1v6dcmt3 58 6 vector vector NN cord-020871-1v6dcmt3 58 7 ; ; : cord-020871-1v6dcmt3 58 8 in in IN cord-020871-1v6dcmt3 58 9 ( ( -LRB- cord-020871-1v6dcmt3 58 10 ii ii NN cord-020871-1v6dcmt3 58 11 ) ) -RRB- cord-020871-1v6dcmt3 58 12 it -PRON- PRP cord-020871-1v6dcmt3 58 13 indicates indicate VBZ cord-020871-1v6dcmt3 58 14 the the DT cord-020871-1v6dcmt3 58 15 mean mean JJ cord-020871-1v6dcmt3 58 16 direction direction NN cord-020871-1v6dcmt3 58 17 and and CC cord-020871-1v6dcmt3 58 18 κ κ NN cord-020871-1v6dcmt3 59 1 i i PRP cord-020871-1v6dcmt3 59 2 is be VBZ cord-020871-1v6dcmt3 59 3 the the DT cord-020871-1v6dcmt3 59 4 concentration concentration NN cord-020871-1v6dcmt3 59 5 parameter parameter NN cord-020871-1v6dcmt3 59 6 . . . 
cord-020871-1v6dcmt3 60 1 We -PRON- PRP cord-020871-1v6dcmt3 60 2 implement implement VBP cord-020871-1v6dcmt3 60 3 the the DT cord-020871-1v6dcmt3 60 4 FK FK NNP cord-020871-1v6dcmt3 60 5 - - HYPH cord-020871-1v6dcmt3 60 6 based base VBN cord-020871-1v6dcmt3 60 7 algorithms algorithm NNS cord-020871-1v6dcmt3 60 8 by by IN cord-020871-1v6dcmt3 60 9 ourselves -PRON- PRP cord-020871-1v6dcmt3 60 10 , , , cord-020871-1v6dcmt3 60 11 with with IN cord-020871-1v6dcmt3 60 12 the the DT cord-020871-1v6dcmt3 60 13 help help NN cord-020871-1v6dcmt3 60 14 of of IN cord-020871-1v6dcmt3 60 15 the the DT cord-020871-1v6dcmt3 60 16 scikit scikit NNS cord-020871-1v6dcmt3 60 17 - - HYPH cord-020871-1v6dcmt3 60 18 learn learn VB cord-020871-1v6dcmt3 60 19 library library NN cord-020871-1v6dcmt3 60 20 for for IN cord-020871-1v6dcmt3 60 21 fitting fit VBG cord-020871-1v6dcmt3 60 22 a a DT cord-020871-1v6dcmt3 60 23 mixture mixture NN cord-020871-1v6dcmt3 60 24 of of IN cord-020871-1v6dcmt3 60 25 Gaussian gaussian JJ cord-020871-1v6dcmt3 60 26 models model NNS cord-020871-1v6dcmt3 60 27 and and CC cord-020871-1v6dcmt3 60 28 of of IN cord-020871-1v6dcmt3 60 29 the the DT cord-020871-1v6dcmt3 60 30 Spherecluster Spherecluster NNP cord-020871-1v6dcmt3 60 31 package package NN cord-020871-1v6dcmt3 60 32 5 5 CD cord-020871-1v6dcmt3 60 33 for for IN cord-020871-1v6dcmt3 60 34 fitting fit VBG cord-020871-1v6dcmt3 60 35 a a DT cord-020871-1v6dcmt3 60 36 mixture mixture NN cord-020871-1v6dcmt3 60 37 of of IN cord-020871-1v6dcmt3 60 38 von von NNP cord-020871-1v6dcmt3 60 39 Mises Mises NNP cord-020871-1v6dcmt3 60 40 - - HYPH cord-020871-1v6dcmt3 60 41 Fisher Fisher NNP cord-020871-1v6dcmt3 60 42 distributions distribution NNS cord-020871-1v6dcmt3 60 43 to to IN cord-020871-1v6dcmt3 60 44 our -PRON- PRP$ cord-020871-1v6dcmt3 60 45 data datum NNS cord-020871-1v6dcmt3 60 46 . . . 
cord-020871-1v6dcmt3 61 1 The the DT cord-020871-1v6dcmt3 61 2 implementation implementation NN cord-020871-1v6dcmt3 61 3 details detail NNS cord-020871-1v6dcmt3 61 4 of of IN cord-020871-1v6dcmt3 61 5 each each DT cord-020871-1v6dcmt3 61 6 algorithm algorithm NN cord-020871-1v6dcmt3 61 7 are be VBP cord-020871-1v6dcmt3 61 8 described describe VBN cord-020871-1v6dcmt3 61 9 in in IN cord-020871-1v6dcmt3 61 10 what what WP cord-020871-1v6dcmt3 61 11 follows follow VBZ cord-020871-1v6dcmt3 61 12 . . . cord-020871-1v6dcmt3 62 1 Each each DT cord-020871-1v6dcmt3 62 2 of of IN cord-020871-1v6dcmt3 62 3 the the DT cord-020871-1v6dcmt3 62 4 following follow VBG cord-020871-1v6dcmt3 62 5 experiments experiment NNS cord-020871-1v6dcmt3 62 6 is be VBZ cord-020871-1v6dcmt3 62 7 conceptually conceptually RB cord-020871-1v6dcmt3 62 8 divided divide VBN cord-020871-1v6dcmt3 62 9 in in IN cord-020871-1v6dcmt3 62 10 three three CD cord-020871-1v6dcmt3 62 11 phases phase NNS cord-020871-1v6dcmt3 62 12 . . . cord-020871-1v6dcmt3 63 1 First first RB cord-020871-1v6dcmt3 63 2 , , , cord-020871-1v6dcmt3 63 3 text text NN cord-020871-1v6dcmt3 63 4 processing processing NN cord-020871-1v6dcmt3 63 5 ( ( -LRB- cord-020871-1v6dcmt3 63 6 e.g. e.g. 
RB cord-020871-1v6dcmt3 63 7 tokenisation tokenisation NN cord-020871-1v6dcmt3 63 8 ) ) -RRB- cord-020871-1v6dcmt3 63 9 ; ; : cord-020871-1v6dcmt3 63 10 second second RB cord-020871-1v6dcmt3 63 11 , , , cord-020871-1v6dcmt3 63 12 creating create VBG cord-020871-1v6dcmt3 63 13 a a DT cord-020871-1v6dcmt3 63 14 fixed fix VBN cord-020871-1v6dcmt3 63 15 - - HYPH cord-020871-1v6dcmt3 63 16 length length NN cord-020871-1v6dcmt3 63 17 vector vector NN cord-020871-1v6dcmt3 63 18 representation representation NN cord-020871-1v6dcmt3 63 19 for for IN cord-020871-1v6dcmt3 63 20 every every DT cord-020871-1v6dcmt3 63 21 document document NN cord-020871-1v6dcmt3 63 22 ; ; : cord-020871-1v6dcmt3 63 23 finally finally RB cord-020871-1v6dcmt3 63 24 , , , cord-020871-1v6dcmt3 63 25 the the DT cord-020871-1v6dcmt3 63 26 third third JJ cord-020871-1v6dcmt3 63 27 phase phase NN cord-020871-1v6dcmt3 63 28 is be VBZ cord-020871-1v6dcmt3 63 29 determined determine VBN cord-020871-1v6dcmt3 63 30 by by IN cord-020871-1v6dcmt3 63 31 the the DT cord-020871-1v6dcmt3 63 32 goal goal NN cord-020871-1v6dcmt3 63 33 to to TO cord-020871-1v6dcmt3 63 34 be be VB cord-020871-1v6dcmt3 63 35 achieved achieve VBN cord-020871-1v6dcmt3 63 36 , , , cord-020871-1v6dcmt3 63 37 i.e. i.e. FW cord-020871-1v6dcmt3 63 38 classification classification NN cord-020871-1v6dcmt3 63 39 , , , cord-020871-1v6dcmt3 63 40 clustering clustering NN cord-020871-1v6dcmt3 63 41 , , , cord-020871-1v6dcmt3 63 42 and and CC cord-020871-1v6dcmt3 63 43 retrieval retrieval NN cord-020871-1v6dcmt3 63 44 . . . 
cord-020871-1v6dcmt3 64 1 For for IN cord-020871-1v6dcmt3 64 2 the the DT cord-020871-1v6dcmt3 64 3 first first JJ cord-020871-1v6dcmt3 64 4 phase phase NN cord-020871-1v6dcmt3 64 5 the the DT cord-020871-1v6dcmt3 64 6 same same JJ cord-020871-1v6dcmt3 64 7 pre pre JJ cord-020871-1v6dcmt3 64 8 - - JJ cord-020871-1v6dcmt3 64 9 processing processing NN cord-020871-1v6dcmt3 64 10 is be VBZ cord-020871-1v6dcmt3 64 11 applied apply VBN cord-020871-1v6dcmt3 64 12 to to IN cord-020871-1v6dcmt3 64 13 all all DT cord-020871-1v6dcmt3 64 14 datasets dataset NNS cord-020871-1v6dcmt3 64 15 . . . cord-020871-1v6dcmt3 65 1 In in IN cord-020871-1v6dcmt3 65 2 the the DT cord-020871-1v6dcmt3 65 3 original original JJ cord-020871-1v6dcmt3 65 4 paper paper NN cord-020871-1v6dcmt3 65 5 , , , cord-020871-1v6dcmt3 65 6 this this DT cord-020871-1v6dcmt3 65 7 phase phase NN cord-020871-1v6dcmt3 65 8 was be VBD cord-020871-1v6dcmt3 65 9 only only RB cord-020871-1v6dcmt3 65 10 briefly briefly RB cord-020871-1v6dcmt3 65 11 described describe VBN cord-020871-1v6dcmt3 65 12 as as IN cord-020871-1v6dcmt3 65 13 tokenisation tokenisation NN cord-020871-1v6dcmt3 65 14 and and CC cord-020871-1v6dcmt3 65 15 stopword stopword NN cord-020871-1v6dcmt3 65 16 removal removal NN cord-020871-1v6dcmt3 65 17 . . . cord-020871-1v6dcmt3 66 1 It -PRON- PRP cord-020871-1v6dcmt3 66 2 is be VBZ cord-020871-1v6dcmt3 66 3 not not RB cord-020871-1v6dcmt3 66 4 given give VBN cord-020871-1v6dcmt3 66 5 what what WP cord-020871-1v6dcmt3 66 6 tokeniser tokeniser NNP cord-020871-1v6dcmt3 66 7 , , , cord-020871-1v6dcmt3 66 8 linguistic linguistic JJ cord-020871-1v6dcmt3 66 9 filters filter NNS cord-020871-1v6dcmt3 66 10 ( ( -LRB- cord-020871-1v6dcmt3 66 11 stemming stem VBG cord-020871-1v6dcmt3 66 12 , , , cord-020871-1v6dcmt3 66 13 lemmatisation lemmatisation NN cord-020871-1v6dcmt3 66 14 , , , cord-020871-1v6dcmt3 66 15 etc etc FW cord-020871-1v6dcmt3 66 16 . . . 
cord-020871-1v6dcmt3 66 17 ) ) -RRB- cord-020871-1v6dcmt3 66 18 , , , cord-020871-1v6dcmt3 66 19 or or CC cord-020871-1v6dcmt3 66 20 stop stop VB cord-020871-1v6dcmt3 66 21 word word NN cord-020871-1v6dcmt3 66 22 list list NN cord-020871-1v6dcmt3 66 23 were be VBD cord-020871-1v6dcmt3 66 24 used use VBN cord-020871-1v6dcmt3 66 25 . . . cord-020871-1v6dcmt3 67 1 Knowing know VBG cord-020871-1v6dcmt3 67 2 that that IN cord-020871-1v6dcmt3 67 3 the the DT cord-020871-1v6dcmt3 67 4 gensim gensim NNP cord-020871-1v6dcmt3 67 5 library library NNP cord-020871-1v6dcmt3 67 6 was be VBD cord-020871-1v6dcmt3 67 7 used use VBN cord-020871-1v6dcmt3 67 8 , , , cord-020871-1v6dcmt3 67 9 we -PRON- PRP cord-020871-1v6dcmt3 67 10 took take VBD cord-020871-1v6dcmt3 67 11 all all DT cord-020871-1v6dcmt3 67 12 standard standard JJ cord-020871-1v6dcmt3 67 13 parameters parameter NNS cord-020871-1v6dcmt3 67 14 ( ( -LRB- cord-020871-1v6dcmt3 67 15 see see VB cord-020871-1v6dcmt3 67 16 provided provide VBN cord-020871-1v6dcmt3 67 17 code code NN cord-020871-1v6dcmt3 67 18 6 6 CD cord-020871-1v6dcmt3 67 19 ) ) -RRB- cord-020871-1v6dcmt3 67 20 . . . 
cord-020871-1v6dcmt3 68 1 Gensim Gensim NNP cord-020871-1v6dcmt3 68 2 however however RB cord-020871-1v6dcmt3 68 3 does do VBZ cord-020871-1v6dcmt3 68 4 not not RB cord-020871-1v6dcmt3 68 5 come come VB cord-020871-1v6dcmt3 68 6 with with IN cord-020871-1v6dcmt3 68 7 a a DT cord-020871-1v6dcmt3 68 8 pre pre JJ cord-020871-1v6dcmt3 68 9 - - JJ cord-020871-1v6dcmt3 68 10 defined define VBN cord-020871-1v6dcmt3 68 11 stopword stopword NN cord-020871-1v6dcmt3 68 12 list list NN cord-020871-1v6dcmt3 68 13 , , , cord-020871-1v6dcmt3 68 14 and and CC cord-020871-1v6dcmt3 68 15 therefore therefore RB cord-020871-1v6dcmt3 68 16 , , , cord-020871-1v6dcmt3 68 17 based base VBN cord-020871-1v6dcmt3 68 18 on on IN cord-020871-1v6dcmt3 68 19 our -PRON- PRP$ cord-020871-1v6dcmt3 68 20 own own JJ cord-020871-1v6dcmt3 68 21 experience experience NN cord-020871-1v6dcmt3 68 22 , , , cord-020871-1v6dcmt3 68 23 we -PRON- PRP cord-020871-1v6dcmt3 68 24 used use VBD cord-020871-1v6dcmt3 68 25 the the DT cord-020871-1v6dcmt3 68 26 one one NN cord-020871-1v6dcmt3 68 27 provided provide VBN cord-020871-1v6dcmt3 68 28 in in IN cord-020871-1v6dcmt3 68 29 the the DT cord-020871-1v6dcmt3 68 30 NLTK NLTK NNP cord-020871-1v6dcmt3 68 31 library library NN cord-020871-1v6dcmt3 68 32 7 7 CD cord-020871-1v6dcmt3 68 33 for for IN cord-020871-1v6dcmt3 68 34 English English NNP cord-020871-1v6dcmt3 68 35 . . . 
cord-020871-1v6dcmt3 69 1 For for IN cord-020871-1v6dcmt3 69 2 the the DT cord-020871-1v6dcmt3 69 3 second second JJ cord-020871-1v6dcmt3 69 4 phase phase NN cord-020871-1v6dcmt3 69 5 , , , cord-020871-1v6dcmt3 69 6 transforming transform VBG cord-020871-1v6dcmt3 69 7 terms term NNS cord-020871-1v6dcmt3 69 8 and and CC cord-020871-1v6dcmt3 69 9 documents document NNS cord-020871-1v6dcmt3 69 10 to to IN cord-020871-1v6dcmt3 69 11 vectors vector NNS cord-020871-1v6dcmt3 69 12 , , , cord-020871-1v6dcmt3 69 13 Zhang Zhang NNP cord-020871-1v6dcmt3 69 14 et et NNP cord-020871-1v6dcmt3 69 15 al al NNP cord-020871-1v6dcmt3 69 16 . . . cord-020871-1v6dcmt3 70 1 [ [ -LRB- cord-020871-1v6dcmt3 70 2 14 14 CD cord-020871-1v6dcmt3 70 3 ] ] -RRB- cord-020871-1v6dcmt3 70 4 specify specify VBP cord-020871-1v6dcmt3 70 5 that that IN cord-020871-1v6dcmt3 70 6 all all DT cord-020871-1v6dcmt3 70 7 trained train VBN cord-020871-1v6dcmt3 70 8 models model NNS cord-020871-1v6dcmt3 70 9 are be VBP cord-020871-1v6dcmt3 70 10 50 50 CD cord-020871-1v6dcmt3 70 11 dimensional dimensional JJ cord-020871-1v6dcmt3 70 12 . . . 
cord-020871-1v6dcmt3 71 1 We -PRON- PRP cord-020871-1v6dcmt3 71 2 have have VBP cord-020871-1v6dcmt3 71 3 additionally additionally RB cord-020871-1v6dcmt3 71 4 experimented experiment VBN cord-020871-1v6dcmt3 71 5 with with IN cord-020871-1v6dcmt3 71 6 dimensionality dimensionality NN cord-020871-1v6dcmt3 71 7 20 20 CD cord-020871-1v6dcmt3 71 8 ( ( -LRB- cord-020871-1v6dcmt3 71 9 used use VBN cord-020871-1v6dcmt3 71 10 by by IN cord-020871-1v6dcmt3 71 11 Clinchant Clinchant NNP cord-020871-1v6dcmt3 71 12 and and CC cord-020871-1v6dcmt3 71 13 Perronnin Perronnin NNP cord-020871-1v6dcmt3 71 14 [ [ -LRB- cord-020871-1v6dcmt3 71 15 3 3 CD cord-020871-1v6dcmt3 71 16 ] ] -RRB- cord-020871-1v6dcmt3 71 17 for for IN cord-020871-1v6dcmt3 71 18 clustering cluster VBG cord-020871-1v6dcmt3 71 19 ) ) -RRB- cord-020871-1v6dcmt3 71 20 and and CC cord-020871-1v6dcmt3 71 21 100 100 CD cord-020871-1v6dcmt3 71 22 , , , cord-020871-1v6dcmt3 71 23 as as IN cord-020871-1v6dcmt3 71 24 we -PRON- PRP cord-020871-1v6dcmt3 71 25 hypothesized hypothesize VBD cord-020871-1v6dcmt3 71 26 that that IN cord-020871-1v6dcmt3 71 27 50 50 CD cord-020871-1v6dcmt3 71 28 might may MD cord-020871-1v6dcmt3 71 29 be be VB cord-020871-1v6dcmt3 71 30 too too RB cord-020871-1v6dcmt3 71 31 low low JJ cord-020871-1v6dcmt3 71 32 . . . cord-020871-1v6dcmt3 72 1 The the DT cord-020871-1v6dcmt3 72 2 TF TF NNP cord-020871-1v6dcmt3 72 3 - - HYPH cord-020871-1v6dcmt3 72 4 IDF IDF NNP cord-020871-1v6dcmt3 72 5 model model NN cord-020871-1v6dcmt3 72 6 is be VBZ cord-020871-1v6dcmt3 72 7 5000 5000 CD cord-020871-1v6dcmt3 72 8 dimensional dimensional JJ cord-020871-1v6dcmt3 72 9 ( ( -LRB- cord-020871-1v6dcmt3 72 10 i.e. i.e. 
FW cord-020871-1v6dcmt3 72 11 only only RB cord-020871-1v6dcmt3 72 12 the the DT cord-020871-1v6dcmt3 72 13 top top JJ cord-020871-1v6dcmt3 72 14 5000 5000 CD cord-020871-1v6dcmt3 72 15 terms term NNS cord-020871-1v6dcmt3 72 16 based base VBN cord-020871-1v6dcmt3 72 17 on on IN cord-020871-1v6dcmt3 72 18 their -PRON- PRP$ cord-020871-1v6dcmt3 72 19 tf tf NNP cord-020871-1v6dcmt3 72 20 - - HYPH cord-020871-1v6dcmt3 72 21 idf idf NNP cord-020871-1v6dcmt3 72 22 value value NN cord-020871-1v6dcmt3 72 23 are be VBP cord-020871-1v6dcmt3 72 24 used use VBN cord-020871-1v6dcmt3 72 25 ) ) -RRB- cord-020871-1v6dcmt3 72 26 , , , cord-020871-1v6dcmt3 72 27 while while IN cord-020871-1v6dcmt3 72 28 the the DT cord-020871-1v6dcmt3 72 29 Fisher Fisher NNP cord-020871-1v6dcmt3 72 30 - - HYPH cord-020871-1v6dcmt3 72 31 Kernel Kernel NNP cord-020871-1v6dcmt3 72 32 models model NNS cord-020871-1v6dcmt3 72 33 are be VBP cord-020871-1v6dcmt3 72 34 15 15 CD cord-020871-1v6dcmt3 72 35 × × NN cord-020871-1v6dcmt3 72 36 d d NN cord-020871-1v6dcmt3 72 37 dimensional dimensional JJ cord-020871-1v6dcmt3 72 38 , , , cord-020871-1v6dcmt3 72 39 where where WRB cord-020871-1v6dcmt3 72 40 d d NN cord-020871-1v6dcmt3 72 41 = = NN cord-020871-1v6dcmt3 72 42 { { -LRB- cord-020871-1v6dcmt3 72 43 20 20 CD cord-020871-1v6dcmt3 72 44 , , , cord-020871-1v6dcmt3 72 45 50 50 CD cord-020871-1v6dcmt3 72 46 , , , cord-020871-1v6dcmt3 72 47 100 100 CD cord-020871-1v6dcmt3 72 48 } } -RRB- cord-020871-1v6dcmt3 72 49 , , , cord-020871-1v6dcmt3 72 50 as as IN cord-020871-1v6dcmt3 72 51 just just RB cord-020871-1v6dcmt3 72 52 explained explain VBN cord-020871-1v6dcmt3 72 53 . . . 
cord-020871-1v6dcmt3 73 1 In in IN cord-020871-1v6dcmt3 73 2 what what WP cord-020871-1v6dcmt3 73 3 follows follow VBZ cord-020871-1v6dcmt3 73 4 , , , cord-020871-1v6dcmt3 73 5 d d NNP cord-020871-1v6dcmt3 73 6 refers refer VBZ cord-020871-1v6dcmt3 73 7 to to IN cord-020871-1v6dcmt3 73 8 the the DT cord-020871-1v6dcmt3 73 9 dimensionality dimensionality NN cord-020871-1v6dcmt3 73 10 of of IN cord-020871-1v6dcmt3 73 11 LSI LSI NNP cord-020871-1v6dcmt3 73 12 , , , cord-020871-1v6dcmt3 73 13 LDA LDA NNP cord-020871-1v6dcmt3 73 14 , , , cord-020871-1v6dcmt3 73 15 cBow cBow NNP cord-020871-1v6dcmt3 73 16 , , , cord-020871-1v6dcmt3 73 17 and and CC cord-020871-1v6dcmt3 73 18 PV pv NN cord-020871-1v6dcmt3 73 19 models model NNS cord-020871-1v6dcmt3 73 20 . . . cord-020871-1v6dcmt3 74 1 The the DT cord-020871-1v6dcmt3 74 2 cBoW cbow NN cord-020871-1v6dcmt3 74 3 and and CC cord-020871-1v6dcmt3 74 4 PV pv NN cord-020871-1v6dcmt3 74 5 models model NNS cord-020871-1v6dcmt3 74 6 are be VBP cord-020871-1v6dcmt3 74 7 trained train VBN cord-020871-1v6dcmt3 74 8 using use VBG cord-020871-1v6dcmt3 74 9 a a DT cord-020871-1v6dcmt3 74 10 default default NN cord-020871-1v6dcmt3 74 11 window window NN cord-020871-1v6dcmt3 74 12 size size NN cord-020871-1v6dcmt3 74 13 of of IN cord-020871-1v6dcmt3 74 14 5 5 CD cord-020871-1v6dcmt3 74 15 , , , cord-020871-1v6dcmt3 74 16 keeping keep VBG cord-020871-1v6dcmt3 74 17 both both CC cord-020871-1v6dcmt3 74 18 low low JJ cord-020871-1v6dcmt3 74 19 and and CC cord-020871-1v6dcmt3 74 20 high high JJ cord-020871-1v6dcmt3 74 21 - - HYPH cord-020871-1v6dcmt3 74 22 frequency frequency NN cord-020871-1v6dcmt3 74 23 terms term NNS cord-020871-1v6dcmt3 74 24 , , , cord-020871-1v6dcmt3 74 25 again again RB cord-020871-1v6dcmt3 74 26 following follow VBG cord-020871-1v6dcmt3 74 27 the the DT cord-020871-1v6dcmt3 74 28 setup setup NN cord-020871-1v6dcmt3 74 29 of of IN cord-020871-1v6dcmt3 74 30 the the DT cord-020871-1v6dcmt3 74 31 original original JJ 
cord-020871-1v6dcmt3 74 32 experiment experiment NN cord-020871-1v6dcmt3 74 33 . . . cord-020871-1v6dcmt3 75 1 The the DT cord-020871-1v6dcmt3 75 2 LDA LDA NNP cord-020871-1v6dcmt3 75 3 model model NN cord-020871-1v6dcmt3 75 4 is be VBZ cord-020871-1v6dcmt3 75 5 trained train VBN cord-020871-1v6dcmt3 75 6 using use VBG cord-020871-1v6dcmt3 75 7 a a DT cord-020871-1v6dcmt3 75 8 chunk chunk NN cord-020871-1v6dcmt3 75 9 size size NN cord-020871-1v6dcmt3 75 10 of of IN cord-020871-1v6dcmt3 75 11 1000 1000 CD cord-020871-1v6dcmt3 75 12 documents document NNS cord-020871-1v6dcmt3 75 13 and and CC cord-020871-1v6dcmt3 75 14 for for IN cord-020871-1v6dcmt3 75 15 a a DT cord-020871-1v6dcmt3 75 16 number number NN cord-020871-1v6dcmt3 75 17 of of IN cord-020871-1v6dcmt3 75 18 iterations iteration NNS cord-020871-1v6dcmt3 75 19 over over IN cord-020871-1v6dcmt3 75 20 the the DT cord-020871-1v6dcmt3 75 21 corpus corpus NN cord-020871-1v6dcmt3 75 22 ranging range VBG cord-020871-1v6dcmt3 75 23 from from IN cord-020871-1v6dcmt3 75 24 20 20 CD cord-020871-1v6dcmt3 75 25 to to TO cord-020871-1v6dcmt3 75 26 100 100 CD cord-020871-1v6dcmt3 75 27 . . . 
cord-020871-1v6dcmt3 76 1 For for IN cord-020871-1v6dcmt3 76 2 the the DT cord-020871-1v6dcmt3 76 3 FK FK NNP cord-020871-1v6dcmt3 76 4 methods method NNS cord-020871-1v6dcmt3 76 5 , , , cord-020871-1v6dcmt3 76 6 both both DT cord-020871-1v6dcmt3 76 7 fitting fitting JJ cord-020871-1v6dcmt3 76 8 procedures procedure NNS cord-020871-1v6dcmt3 76 9 ( ( -LRB- cord-020871-1v6dcmt3 76 10 GMM GMM NNP cord-020871-1v6dcmt3 76 11 and and CC cord-020871-1v6dcmt3 76 12 moVMF movmf NN cord-020871-1v6dcmt3 76 13 ) ) -RRB- cord-020871-1v6dcmt3 76 14 are be VBP cord-020871-1v6dcmt3 76 15 independently independently RB cord-020871-1v6dcmt3 76 16 initialised initialise VBN cord-020871-1v6dcmt3 76 17 10 10 CD cord-020871-1v6dcmt3 76 18 times time NNS cord-020871-1v6dcmt3 76 19 and and CC cord-020871-1v6dcmt3 76 20 the the DT cord-020871-1v6dcmt3 76 21 best good JJS cord-020871-1v6dcmt3 76 22 fitting fitting JJ cord-020871-1v6dcmt3 76 23 model model NN cord-020871-1v6dcmt3 76 24 is be VBZ cord-020871-1v6dcmt3 76 25 kept keep VBN cord-020871-1v6dcmt3 76 26 . . . cord-020871-1v6dcmt3 77 1 For for IN cord-020871-1v6dcmt3 77 2 the the DT cord-020871-1v6dcmt3 77 3 third third JJ cord-020871-1v6dcmt3 77 4 phase phase NN cord-020871-1v6dcmt3 77 5 , , , cord-020871-1v6dcmt3 77 6 parameters parameter NNS cord-020871-1v6dcmt3 77 7 are be VBP cord-020871-1v6dcmt3 77 8 explained explain VBN cord-020871-1v6dcmt3 77 9 in in IN cord-020871-1v6dcmt3 77 10 the the DT cord-020871-1v6dcmt3 77 11 following follow VBG cord-020871-1v6dcmt3 77 12 sections section NNS cord-020871-1v6dcmt3 77 13 . . . 
cord-020871-1v6dcmt3 78 1 Logistic logistic JJ cord-020871-1v6dcmt3 78 2 regression regression NN cord-020871-1v6dcmt3 78 3 is be VBZ cord-020871-1v6dcmt3 78 4 used use VBN cord-020871-1v6dcmt3 78 5 for for IN cord-020871-1v6dcmt3 78 6 classification classification NN cord-020871-1v6dcmt3 78 7 in in IN cord-020871-1v6dcmt3 78 8 Zhang Zhang NNP cord-020871-1v6dcmt3 78 9 et et NNP cord-020871-1v6dcmt3 78 10 al al NNP cord-020871-1v6dcmt3 78 11 . . NNP cord-020871-1v6dcmt3 78 12 , , , cord-020871-1v6dcmt3 78 13 and and CC cord-020871-1v6dcmt3 78 14 therefore therefore RB cord-020871-1v6dcmt3 78 15 also also RB cord-020871-1v6dcmt3 78 16 used use VBD cord-020871-1v6dcmt3 78 17 here here RB cord-020871-1v6dcmt3 78 18 . . . cord-020871-1v6dcmt3 79 1 The the DT cord-020871-1v6dcmt3 79 2 results result NNS cord-020871-1v6dcmt3 79 3 of of IN cord-020871-1v6dcmt3 79 4 our -PRON- PRP$ cord-020871-1v6dcmt3 79 5 experiments experiment NNS cord-020871-1v6dcmt3 79 6 , , , cord-020871-1v6dcmt3 79 7 for for IN cord-020871-1v6dcmt3 79 8 d d NN cord-020871-1v6dcmt3 79 9 = = SYM cord-020871-1v6dcmt3 79 10 50 50 CD cord-020871-1v6dcmt3 79 11 and and CC cord-020871-1v6dcmt3 79 12 100-dimensional 100-dimensional CD cord-020871-1v6dcmt3 79 13 feature feature NN cord-020871-1v6dcmt3 79 14 vectors vector NNS cord-020871-1v6dcmt3 79 15 , , , cord-020871-1v6dcmt3 79 16 are be VBP cord-020871-1v6dcmt3 79 17 summarised summarise VBN cord-020871-1v6dcmt3 79 18 in in IN cord-020871-1v6dcmt3 79 19 Table Table NNP cord-020871-1v6dcmt3 79 20 1 1 CD cord-020871-1v6dcmt3 79 21 . . . 
cord-020871-1v6dcmt3 80 1 For for IN cord-020871-1v6dcmt3 80 2 all all PDT cord-020871-1v6dcmt3 80 3 the the DT cord-020871-1v6dcmt3 80 4 methods method NNS cord-020871-1v6dcmt3 80 5 , , , cord-020871-1v6dcmt3 80 6 we -PRON- PRP cord-020871-1v6dcmt3 80 7 perform perform VBP cord-020871-1v6dcmt3 80 8 a a DT cord-020871-1v6dcmt3 80 9 parameter parameter NN cord-020871-1v6dcmt3 80 10 scan scan NN cord-020871-1v6dcmt3 80 11 of of IN cord-020871-1v6dcmt3 80 12 the the DT cord-020871-1v6dcmt3 80 13 ( ( -LRB- cord-020871-1v6dcmt3 80 14 inverse inverse NN cord-020871-1v6dcmt3 80 15 ) ) -RRB- cord-020871-1v6dcmt3 80 16 regularisation regularisation NN cord-020871-1v6dcmt3 80 17 strength strength NN cord-020871-1v6dcmt3 80 18 of of IN cord-020871-1v6dcmt3 80 19 the the DT cord-020871-1v6dcmt3 80 20 logistic logistic JJ cord-020871-1v6dcmt3 80 21 regression regression NN cord-020871-1v6dcmt3 80 22 classifier classifier NN cord-020871-1v6dcmt3 80 23 , , , cord-020871-1v6dcmt3 80 24 as as IN cord-020871-1v6dcmt3 80 25 shown show VBN cord-020871-1v6dcmt3 80 26 in in IN cord-020871-1v6dcmt3 80 27 Fig Fig NNP cord-020871-1v6dcmt3 80 28 . . NNP cord-020871-1v6dcmt3 80 29 1(a 1(a CD cord-020871-1v6dcmt3 80 30 ) ) -RRB- cord-020871-1v6dcmt3 81 1 and and CC cord-020871-1v6dcmt3 81 2 ( ( -LRB- cord-020871-1v6dcmt3 81 3 b b NN cord-020871-1v6dcmt3 81 4 ) ) -RRB- cord-020871-1v6dcmt3 81 5 . . . 
cord-020871-1v6dcmt3 82 1 Additionally additionally RB cord-020871-1v6dcmt3 82 2 , , , cord-020871-1v6dcmt3 82 3 the the DT cord-020871-1v6dcmt3 82 4 learning learn VBG cord-020871-1v6dcmt3 82 5 algorithms algorithm NNS cord-020871-1v6dcmt3 82 6 are be VBP cord-020871-1v6dcmt3 82 7 trained train VBN cord-020871-1v6dcmt3 82 8 for for IN cord-020871-1v6dcmt3 82 9 a a DT cord-020871-1v6dcmt3 82 10 different different JJ cord-020871-1v6dcmt3 82 11 number number NN cord-020871-1v6dcmt3 82 12 of of IN cord-020871-1v6dcmt3 82 13 epochs epoch NNS cord-020871-1v6dcmt3 82 14 and and CC cord-020871-1v6dcmt3 82 15 the the DT cord-020871-1v6dcmt3 82 16 resulting result VBG cord-020871-1v6dcmt3 82 17 classification classification NN cord-020871-1v6dcmt3 82 18 accuracy accuracy NN cord-020871-1v6dcmt3 82 19 assessed assess VBN cord-020871-1v6dcmt3 82 20 , , , cord-020871-1v6dcmt3 82 21 cf cf NNP cord-020871-1v6dcmt3 82 22 . . . cord-020871-1v6dcmt3 83 1 Fig Fig NNP cord-020871-1v6dcmt3 83 2 . . . cord-020871-1v6dcmt3 83 3 1(c 1(c CD cord-020871-1v6dcmt3 83 4 ) ) -RRB- cord-020871-1v6dcmt3 83 5 and and CC cord-020871-1v6dcmt3 83 6 ( ( -LRB- cord-020871-1v6dcmt3 83 7 d d NN cord-020871-1v6dcmt3 83 8 ) ) -RRB- cord-020871-1v6dcmt3 83 9 . . . 
cord-020871-1v6dcmt3 84 1 Figure figure NN cord-020871-1v6dcmt3 84 2 1 1 CD cord-020871-1v6dcmt3 84 3 ( ( -LRB- cord-020871-1v6dcmt3 84 4 a a NN cord-020871-1v6dcmt3 84 5 ) ) -RRB- cord-020871-1v6dcmt3 84 6 indicates indicate VBZ cord-020871-1v6dcmt3 84 7 that that IN cord-020871-1v6dcmt3 84 8 cBow cBow NNP cord-020871-1v6dcmt3 84 9 , , , cord-020871-1v6dcmt3 84 10 FV FV NNP cord-020871-1v6dcmt3 84 11 - - HYPH cord-020871-1v6dcmt3 84 12 GMM GMM NNP cord-020871-1v6dcmt3 84 13 , , , cord-020871-1v6dcmt3 84 14 FV FV NNP cord-020871-1v6dcmt3 84 15 - - HYPH cord-020871-1v6dcmt3 84 16 moVMF movmf NN cord-020871-1v6dcmt3 84 17 , , , cord-020871-1v6dcmt3 84 18 and and CC cord-020871-1v6dcmt3 84 19 the the DT cord-020871-1v6dcmt3 84 20 simple simple JJ cord-020871-1v6dcmt3 84 21 TF TF NNP cord-020871-1v6dcmt3 84 22 - - HYPH cord-020871-1v6dcmt3 84 23 IDF IDF NNP cord-020871-1v6dcmt3 84 24 , , , cord-020871-1v6dcmt3 84 25 when when WRB cord-020871-1v6dcmt3 84 26 properly properly RB cord-020871-1v6dcmt3 84 27 tuned tune VBN cord-020871-1v6dcmt3 84 28 , , , cord-020871-1v6dcmt3 84 29 exhibit exhibit VBP cord-020871-1v6dcmt3 84 30 a a DT cord-020871-1v6dcmt3 84 31 very very RB cord-020871-1v6dcmt3 84 32 similar similar JJ cord-020871-1v6dcmt3 84 33 accuracy accuracy NN cord-020871-1v6dcmt3 84 34 on on IN cord-020871-1v6dcmt3 84 35 subj subj NNP cord-020871-1v6dcmt3 85 1 -the -the DT cord-020871-1v6dcmt3 85 2 given give VBN cord-020871-1v6dcmt3 85 3 confidence confidence NN cord-020871-1v6dcmt3 85 4 intervals interval NNS cord-020871-1v6dcmt3 85 5 do do VBP cord-020871-1v6dcmt3 85 6 not not RB cord-020871-1v6dcmt3 85 7 indeed indeed RB cord-020871-1v6dcmt3 85 8 allow allow VB cord-020871-1v6dcmt3 85 9 us -PRON- PRP cord-020871-1v6dcmt3 85 10 to to TO cord-020871-1v6dcmt3 85 11 identify identify VB cord-020871-1v6dcmt3 85 12 a a DT cord-020871-1v6dcmt3 85 13 single single JJ cord-020871-1v6dcmt3 85 14 , , , cord-020871-1v6dcmt3 85 15 best good JJS cord-020871-1v6dcmt3 85 16 
model model NN cord-020871-1v6dcmt3 85 17 . . . cord-020871-1v6dcmt3 86 1 Surprisingly surprisingly RB cord-020871-1v6dcmt3 86 2 , , , cord-020871-1v6dcmt3 86 3 TF TF NNP cord-020871-1v6dcmt3 86 4 - - HYPH cord-020871-1v6dcmt3 86 5 IDF IDF NNP cord-020871-1v6dcmt3 86 6 outperforms outperform VBZ cord-020871-1v6dcmt3 86 7 all all PDT cord-020871-1v6dcmt3 86 8 the the DT cord-020871-1v6dcmt3 86 9 others other NNS cord-020871-1v6dcmt3 86 10 on on IN cord-020871-1v6dcmt3 86 11 the the DT cord-020871-1v6dcmt3 86 12 sent send VBN cord-020871-1v6dcmt3 86 13 dataset dataset NN cord-020871-1v6dcmt3 86 14 ( ( -LRB- cord-020871-1v6dcmt3 86 15 Fig Fig NNP cord-020871-1v6dcmt3 86 16 . . . cord-020871-1v6dcmt3 86 17 1(b 1(b CD cord-020871-1v6dcmt3 86 18 ) ) -RRB- cord-020871-1v6dcmt3 86 19 ) ) -RRB- cord-020871-1v6dcmt3 86 20 . . . cord-020871-1v6dcmt3 87 1 Increasing increase VBG cord-020871-1v6dcmt3 87 2 the the DT cord-020871-1v6dcmt3 87 3 dimensionality dimensionality NN cord-020871-1v6dcmt3 87 4 of of IN cord-020871-1v6dcmt3 87 5 the the DT cord-020871-1v6dcmt3 87 6 feature feature NN cord-020871-1v6dcmt3 87 7 vectors vector NNS cord-020871-1v6dcmt3 87 8 , , , cord-020871-1v6dcmt3 87 9 from from IN cord-020871-1v6dcmt3 87 10 d d NN cord-020871-1v6dcmt3 87 11 = = SYM cord-020871-1v6dcmt3 87 12 50 50 CD cord-020871-1v6dcmt3 87 13 to to TO cord-020871-1v6dcmt3 87 14 100 100 CD cord-020871-1v6dcmt3 87 15 , , , cord-020871-1v6dcmt3 87 16 has have VBZ cord-020871-1v6dcmt3 87 17 the the DT cord-020871-1v6dcmt3 87 18 effect effect NN cord-020871-1v6dcmt3 87 19 of of IN cord-020871-1v6dcmt3 87 20 reducing reduce VBG cord-020871-1v6dcmt3 87 21 the the DT cord-020871-1v6dcmt3 87 22 gap gap NN cord-020871-1v6dcmt3 87 23 between between IN cord-020871-1v6dcmt3 87 24 TF TF NNP cord-020871-1v6dcmt3 87 25 - - HYPH cord-020871-1v6dcmt3 87 26 IDF IDF NNP cord-020871-1v6dcmt3 87 27 and and CC cord-020871-1v6dcmt3 87 28 the the DT cord-020871-1v6dcmt3 87 29 rest rest NN cord-020871-1v6dcmt3 87 
30 of of IN cord-020871-1v6dcmt3 87 31 the the DT cord-020871-1v6dcmt3 87 32 models model NNS cord-020871-1v6dcmt3 87 33 on on IN cord-020871-1v6dcmt3 87 34 the the DT cord-020871-1v6dcmt3 87 35 sent send VBN cord-020871-1v6dcmt3 87 36 dataset dataset NN cord-020871-1v6dcmt3 87 37 ( ( -LRB- cord-020871-1v6dcmt3 87 38 see see VB cord-020871-1v6dcmt3 87 39 Table table NN cord-020871-1v6dcmt3 87 40 1 1 CD cord-020871-1v6dcmt3 87 41 ) ) -RRB- cord-020871-1v6dcmt3 87 42 . . . cord-020871-1v6dcmt3 88 1 For for IN cord-020871-1v6dcmt3 88 2 clustering cluster VBG cord-020871-1v6dcmt3 88 3 experiments experiment NNS cord-020871-1v6dcmt3 88 4 , , , cord-020871-1v6dcmt3 88 5 the the DT cord-020871-1v6dcmt3 88 6 obtained obtain VBN cord-020871-1v6dcmt3 88 7 feature feature NN cord-020871-1v6dcmt3 88 8 vectors vector NNS cord-020871-1v6dcmt3 88 9 are be VBP cord-020871-1v6dcmt3 88 10 passed pass VBN cord-020871-1v6dcmt3 88 11 to to IN cord-020871-1v6dcmt3 88 12 the the DT cord-020871-1v6dcmt3 88 13 kmeans kmeans NNP cord-020871-1v6dcmt3 88 14 algorithm algorithm NN cord-020871-1v6dcmt3 88 15 . . . 
cord-020871-1v6dcmt3 89 1 The the DT cord-020871-1v6dcmt3 89 2 results result NNS cord-020871-1v6dcmt3 89 3 of of IN cord-020871-1v6dcmt3 89 4 our -PRON- PRP$ cord-020871-1v6dcmt3 89 5 experiments experiment NNS cord-020871-1v6dcmt3 89 6 , , , cord-020871-1v6dcmt3 89 7 measured measure VBN cord-020871-1v6dcmt3 89 8 in in IN cord-020871-1v6dcmt3 89 9 terms term NNS cord-020871-1v6dcmt3 89 10 of of IN cord-020871-1v6dcmt3 89 11 Adjusted Adjusted NNP cord-020871-1v6dcmt3 89 12 Rand Rand NNP cord-020871-1v6dcmt3 89 13 Index Index NNP cord-020871-1v6dcmt3 89 14 ( ( -LRB- cord-020871-1v6dcmt3 89 15 ARI ARI NNP cord-020871-1v6dcmt3 89 16 ) ) -RRB- cord-020871-1v6dcmt3 89 17 and and CC cord-020871-1v6dcmt3 89 18 Normalized Normalized NNP cord-020871-1v6dcmt3 89 19 Mutual Mutual NNP cord-020871-1v6dcmt3 89 20 Information Information NNP cord-020871-1v6dcmt3 89 21 ( ( -LRB- cord-020871-1v6dcmt3 89 22 NMI NMI NNP cord-020871-1v6dcmt3 89 23 ) ) -RRB- cord-020871-1v6dcmt3 89 24 , , , cord-020871-1v6dcmt3 89 25 are be VBP cord-020871-1v6dcmt3 89 26 summarised summarise VBN cord-020871-1v6dcmt3 89 27 in in IN cord-020871-1v6dcmt3 89 28 Table Table NNP cord-020871-1v6dcmt3 89 29 2 2 CD cord-020871-1v6dcmt3 89 30 . . . cord-020871-1v6dcmt3 90 1 We -PRON- PRP cord-020871-1v6dcmt3 90 2 used use VBD cord-020871-1v6dcmt3 90 3 both both CC cord-020871-1v6dcmt3 90 4 d d NN cord-020871-1v6dcmt3 90 5 = = SYM cord-020871-1v6dcmt3 90 6 20 20 CD cord-020871-1v6dcmt3 90 7 and and CC cord-020871-1v6dcmt3 90 8 50-dimensional 50-dimensional CD cord-020871-1v6dcmt3 90 9 feature feature NN cord-020871-1v6dcmt3 90 10 vectors vector NNS cord-020871-1v6dcmt3 90 11 . . . 
cord-020871-1v6dcmt3 91 1 Note note VB cord-020871-1v6dcmt3 91 2 that that IN cord-020871-1v6dcmt3 91 3 the the DT cord-020871-1v6dcmt3 91 4 evaluation evaluation NN cord-020871-1v6dcmt3 91 5 of of IN cord-020871-1v6dcmt3 91 6 the the DT cord-020871-1v6dcmt3 91 7 clustering clustering NN cord-020871-1v6dcmt3 91 8 algorithms algorithm NNS cord-020871-1v6dcmt3 91 9 is be VBZ cord-020871-1v6dcmt3 91 10 based base VBN cord-020871-1v6dcmt3 91 11 on on IN cord-020871-1v6dcmt3 91 12 the the DT cord-020871-1v6dcmt3 91 13 knowledge knowledge NN cord-020871-1v6dcmt3 91 14 of of IN cord-020871-1v6dcmt3 91 15 the the DT cord-020871-1v6dcmt3 91 16 ground ground NN cord-020871-1v6dcmt3 91 17 truth truth NN cord-020871-1v6dcmt3 91 18 class class NN cord-020871-1v6dcmt3 91 19 assignments assignment NNS cord-020871-1v6dcmt3 91 20 , , , cord-020871-1v6dcmt3 91 21 available available JJ cord-020871-1v6dcmt3 91 22 in in IN cord-020871-1v6dcmt3 91 23 the the DT cord-020871-1v6dcmt3 91 24 20 20 CD cord-020871-1v6dcmt3 91 25 Newsgroups newsgroup NNS cord-020871-1v6dcmt3 91 26 dataset dataset NN cord-020871-1v6dcmt3 91 27 . . . 
cord-020871-1v6dcmt3 92 1 As as IN cord-020871-1v6dcmt3 92 2 opposed oppose VBN cord-020871-1v6dcmt3 92 3 to to IN cord-020871-1v6dcmt3 92 4 classification classification NN cord-020871-1v6dcmt3 92 5 , , , cord-020871-1v6dcmt3 92 6 clustering clustering NN cord-020871-1v6dcmt3 92 7 experiments experiment NNS cord-020871-1v6dcmt3 92 8 show show VBP cord-020871-1v6dcmt3 92 9 a a DT cord-020871-1v6dcmt3 92 10 generous generous JJ cord-020871-1v6dcmt3 92 11 imbalance imbalance NN cord-020871-1v6dcmt3 92 12 in in IN cord-020871-1v6dcmt3 92 13 performance performance NN cord-020871-1v6dcmt3 92 14 and and CC cord-020871-1v6dcmt3 92 15 firmly firmly RB cord-020871-1v6dcmt3 92 16 speak speak VBP cord-020871-1v6dcmt3 92 17 in in IN cord-020871-1v6dcmt3 92 18 favour favour NN cord-020871-1v6dcmt3 92 19 of of IN cord-020871-1v6dcmt3 92 20 PV pv NN cord-020871-1v6dcmt3 92 21 - - HYPH cord-020871-1v6dcmt3 92 22 DBOW DBOW NNP cord-020871-1v6dcmt3 92 23 . . . cord-020871-1v6dcmt3 93 1 Interestingly interestingly RB cord-020871-1v6dcmt3 93 2 , , , cord-020871-1v6dcmt3 93 3 TF TF NNP cord-020871-1v6dcmt3 93 4 - - HYPH cord-020871-1v6dcmt3 93 5 IDF IDF NNP cord-020871-1v6dcmt3 93 6 , , , cord-020871-1v6dcmt3 93 7 FV FV NNP cord-020871-1v6dcmt3 93 8 - - HYPH cord-020871-1v6dcmt3 93 9 GMM GMM NNP cord-020871-1v6dcmt3 93 10 , , , cord-020871-1v6dcmt3 93 11 and and CC cord-020871-1v6dcmt3 93 12 FV fv NN cord-020871-1v6dcmt3 93 13 - - HYPH cord-020871-1v6dcmt3 93 14 moVMF movmf NN cord-020871-1v6dcmt3 93 15 , , , cord-020871-1v6dcmt3 93 16 all all DT cord-020871-1v6dcmt3 93 17 providing provide VBG cord-020871-1v6dcmt3 93 18 high high JJ cord-020871-1v6dcmt3 93 19 - - HYPH cord-020871-1v6dcmt3 93 20 dimensional dimensional JJ cord-020871-1v6dcmt3 93 21 document document NN cord-020871-1v6dcmt3 93 22 representations representation NNS cord-020871-1v6dcmt3 93 23 , , , cord-020871-1v6dcmt3 93 24 have have VBP cord-020871-1v6dcmt3 93 25 a a DT cord-020871-1v6dcmt3 93 26 low low JJ 
cord-020871-1v6dcmt3 93 27 clustering clustering NN cord-020871-1v6dcmt3 93 28 effectiveness effectiveness NN cord-020871-1v6dcmt3 93 29 . . . cord-020871-1v6dcmt3 94 1 LSI LSI NNP cord-020871-1v6dcmt3 94 2 and and CC cord-020871-1v6dcmt3 94 3 LDA LDA NNP cord-020871-1v6dcmt3 94 4 achieve achieve VB cord-020871-1v6dcmt3 94 5 low low JJ cord-020871-1v6dcmt3 94 6 accuracy accuracy NN cord-020871-1v6dcmt3 94 7 ( ( -LRB- cord-020871-1v6dcmt3 94 8 see see VB cord-020871-1v6dcmt3 94 9 Table table NN cord-020871-1v6dcmt3 94 10 1 1 CD cord-020871-1v6dcmt3 94 11 ) ) -RRB- cord-020871-1v6dcmt3 94 12 and and CC cord-020871-1v6dcmt3 94 13 are be VBP cord-020871-1v6dcmt3 94 14 omitted omit VBN cord-020871-1v6dcmt3 94 15 here here RB cord-020871-1v6dcmt3 94 16 for for IN cord-020871-1v6dcmt3 94 17 visibility visibility NN cord-020871-1v6dcmt3 94 18 . . . cord-020871-1v6dcmt3 95 1 The the DT cord-020871-1v6dcmt3 95 2 left left JJ cord-020871-1v6dcmt3 95 3 panels panel NNS cord-020871-1v6dcmt3 95 4 [ [ -LRB- cord-020871-1v6dcmt3 95 5 ( ( -LRB- cord-020871-1v6dcmt3 95 6 a a LS cord-020871-1v6dcmt3 95 7 ) ) -RRB- cord-020871-1v6dcmt3 95 8 and and CC cord-020871-1v6dcmt3 95 9 ( ( -LRB- cord-020871-1v6dcmt3 95 10 b b NN cord-020871-1v6dcmt3 95 11 ) ) -RRB- cord-020871-1v6dcmt3 95 12 ] ] -RRB- cord-020871-1v6dcmt3 95 13 show show VBP cord-020871-1v6dcmt3 95 14 the the DT cord-020871-1v6dcmt3 95 15 effect effect NN cord-020871-1v6dcmt3 95 16 of of IN cord-020871-1v6dcmt3 95 17 ( ( -LRB- cord-020871-1v6dcmt3 95 18 inverse inverse NN cord-020871-1v6dcmt3 95 19 ) ) -RRB- cord-020871-1v6dcmt3 95 20 regularisation regularisation NN cord-020871-1v6dcmt3 95 21 of of IN cord-020871-1v6dcmt3 95 22 the the DT cord-020871-1v6dcmt3 95 23 logistic logistic JJ cord-020871-1v6dcmt3 95 24 regression regression NN cord-020871-1v6dcmt3 95 25 classifier classifier NN cord-020871-1v6dcmt3 95 26 on on IN cord-020871-1v6dcmt3 95 27 the the DT cord-020871-1v6dcmt3 95 28 accuracy accuracy NN 
cord-020871-1v6dcmt3 95 29 , , , cord-020871-1v6dcmt3 95 30 while while IN cord-020871-1v6dcmt3 95 31 the the DT cord-020871-1v6dcmt3 95 32 right right JJ cord-020871-1v6dcmt3 95 33 panels panel NNS cord-020871-1v6dcmt3 95 34 [ [ -LRB- cord-020871-1v6dcmt3 95 35 ( ( -LRB- cord-020871-1v6dcmt3 95 36 c c NN cord-020871-1v6dcmt3 95 37 ) ) -RRB- cord-020871-1v6dcmt3 95 38 and and CC cord-020871-1v6dcmt3 95 39 ( ( -LRB- cord-020871-1v6dcmt3 95 40 d d NN cord-020871-1v6dcmt3 95 41 ) ) -RRB- cord-020871-1v6dcmt3 95 42 ] ] -RRB- cord-020871-1v6dcmt3 95 43 display display VB cord-020871-1v6dcmt3 95 44 the the DT cord-020871-1v6dcmt3 95 45 effect effect NN cord-020871-1v6dcmt3 95 46 of of IN cord-020871-1v6dcmt3 95 47 training training NN cord-020871-1v6dcmt3 95 48 for for IN cord-020871-1v6dcmt3 95 49 the the DT cord-020871-1v6dcmt3 95 50 learning learn VBG cord-020871-1v6dcmt3 95 51 algorithms algorithm NNS cord-020871-1v6dcmt3 95 52 . . . cord-020871-1v6dcmt3 96 1 The the DT cord-020871-1v6dcmt3 96 2 two two CD cord-020871-1v6dcmt3 96 3 symbols symbol NNS cord-020871-1v6dcmt3 96 4 on on IN cord-020871-1v6dcmt3 96 5 the the DT cord-020871-1v6dcmt3 96 6 right right JJ cord-020871-1v6dcmt3 96 7 axis axis NN cord-020871-1v6dcmt3 96 8 in in IN cord-020871-1v6dcmt3 96 9 panels panel NNS cord-020871-1v6dcmt3 96 10 ( ( -LRB- cord-020871-1v6dcmt3 96 11 a a LS cord-020871-1v6dcmt3 96 12 ) ) -RRB- cord-020871-1v6dcmt3 96 13 and and CC cord-020871-1v6dcmt3 96 14 ( ( -LRB- cord-020871-1v6dcmt3 96 15 b b LS cord-020871-1v6dcmt3 96 16 ) ) -RRB- cord-020871-1v6dcmt3 96 17 indicate indicate VBP cord-020871-1v6dcmt3 96 18 the the DT cord-020871-1v6dcmt3 96 19 best good JJS cord-020871-1v6dcmt3 96 20 ( ( -LRB- cord-020871-1v6dcmt3 96 21 FV fv NN cord-020871-1v6dcmt3 96 22 - - HYPH cord-020871-1v6dcmt3 96 23 moVMF movmf NN cord-020871-1v6dcmt3 96 24 ) ) -RRB- cord-020871-1v6dcmt3 96 25 results result NNS cord-020871-1v6dcmt3 96 26 reported report VBN cord-020871-1v6dcmt3 96 27 in in IN 
cord-020871-1v6dcmt3 96 28 [ [ -LRB- cord-020871-1v6dcmt3 96 29 14 14 CD cord-020871-1v6dcmt3 96 30 ] ] -RRB- cord-020871-1v6dcmt3 96 31 . . . cord-020871-1v6dcmt3 96 32 _SP cord-020871-1v6dcmt3 97 1 For for IN cord-020871-1v6dcmt3 97 2 these these DT cord-020871-1v6dcmt3 97 3 experiments experiment NNS cord-020871-1v6dcmt3 97 4 , , , cord-020871-1v6dcmt3 97 5 we -PRON- PRP cord-020871-1v6dcmt3 97 6 extracted extract VBD cord-020871-1v6dcmt3 97 7 from from IN cord-020871-1v6dcmt3 97 8 every every DT cord-020871-1v6dcmt3 97 9 document document NN cord-020871-1v6dcmt3 97 10 of of IN cord-020871-1v6dcmt3 97 11 the the DT cord-020871-1v6dcmt3 97 12 test test NN cord-020871-1v6dcmt3 97 13 collection collection NN cord-020871-1v6dcmt3 97 14 all all PDT cord-020871-1v6dcmt3 97 15 the the DT cord-020871-1v6dcmt3 97 16 raw raw JJ cord-020871-1v6dcmt3 97 17 text text NN cord-020871-1v6dcmt3 97 18 , , , cord-020871-1v6dcmt3 97 19 and and CC cord-020871-1v6dcmt3 97 20 preprocessed preprocesse VBD cord-020871-1v6dcmt3 97 21 it -PRON- PRP cord-020871-1v6dcmt3 97 22 as as IN cord-020871-1v6dcmt3 97 23 described describe VBN cord-020871-1v6dcmt3 97 24 in in IN cord-020871-1v6dcmt3 97 25 the the DT cord-020871-1v6dcmt3 97 26 beginning beginning NN cord-020871-1v6dcmt3 97 27 of of IN cord-020871-1v6dcmt3 97 28 this this DT cord-020871-1v6dcmt3 97 29 section section NN cord-020871-1v6dcmt3 97 30 . . . 
cord-020871-1v6dcmt3 98 1 The the DT cord-020871-1v6dcmt3 98 2 documents document NNS cord-020871-1v6dcmt3 98 3 were be VBD cord-020871-1v6dcmt3 98 4 indexed index VBN cord-020871-1v6dcmt3 98 5 and and CC cord-020871-1v6dcmt3 98 6 retrieved retrieve VBN cord-020871-1v6dcmt3 98 7 for for IN cord-020871-1v6dcmt3 98 8 BM25 BM25 NNP cord-020871-1v6dcmt3 98 9 with with IN cord-020871-1v6dcmt3 98 10 the the DT cord-020871-1v6dcmt3 98 11 Lucene Lucene NNP cord-020871-1v6dcmt3 98 12 8.2 8.2 CD cord-020871-1v6dcmt3 98 13 search search NN cord-020871-1v6dcmt3 98 14 engine engine NN cord-020871-1v6dcmt3 98 15 . . . cord-020871-1v6dcmt3 99 1 We -PRON- PRP cord-020871-1v6dcmt3 99 2 experimented experiment VBD cord-020871-1v6dcmt3 99 3 with with IN cord-020871-1v6dcmt3 99 4 three three CD cord-020871-1v6dcmt3 99 5 topic topic NN cord-020871-1v6dcmt3 99 6 processing processing NN cord-020871-1v6dcmt3 99 7 ways way NNS cord-020871-1v6dcmt3 99 8 : : : cord-020871-1v6dcmt3 99 9 ( ( -LRB- cord-020871-1v6dcmt3 99 10 1 1 LS cord-020871-1v6dcmt3 99 11 ) ) -RRB- cord-020871-1v6dcmt3 99 12 title title NN cord-020871-1v6dcmt3 99 13 only only RB cord-020871-1v6dcmt3 99 14 , , , cord-020871-1v6dcmt3 99 15 ( ( -LRB- cord-020871-1v6dcmt3 99 16 2 2 LS cord-020871-1v6dcmt3 99 17 ) ) -RRB- cord-020871-1v6dcmt3 99 18 description description NN cord-020871-1v6dcmt3 99 19 only only RB cord-020871-1v6dcmt3 99 20 , , , cord-020871-1v6dcmt3 99 21 and and CC cord-020871-1v6dcmt3 99 22 ( ( -LRB- cord-020871-1v6dcmt3 99 23 3 3 LS cord-020871-1v6dcmt3 99 24 ) ) -RRB- cord-020871-1v6dcmt3 99 25 title title NN cord-020871-1v6dcmt3 99 26 and and CC cord-020871-1v6dcmt3 99 27 description description NN cord-020871-1v6dcmt3 99 28 . . . 
cord-020871-1v6dcmt3 100 1 The the DT cord-020871-1v6dcmt3 100 2 third third JJ cord-020871-1v6dcmt3 100 3 way way NN cord-020871-1v6dcmt3 100 4 produces produce VBZ cord-020871-1v6dcmt3 100 5 the the DT cord-020871-1v6dcmt3 100 6 best good JJS cord-020871-1v6dcmt3 100 7 results result NNS cord-020871-1v6dcmt3 100 8 and and CC cord-020871-1v6dcmt3 100 9 closest close JJS cord-020871-1v6dcmt3 100 10 to to IN cord-020871-1v6dcmt3 100 11 the the DT cord-020871-1v6dcmt3 100 12 ones one NNS cord-020871-1v6dcmt3 100 13 reported report VBN cord-020871-1v6dcmt3 100 14 by by IN cord-020871-1v6dcmt3 100 15 Zhang Zhang NNP cord-020871-1v6dcmt3 100 16 et et NNP cord-020871-1v6dcmt3 100 17 al al NNP cord-020871-1v6dcmt3 100 18 . . . cord-020871-1v6dcmt3 101 1 [ [ -LRB- cord-020871-1v6dcmt3 101 2 14 14 CD cord-020871-1v6dcmt3 101 3 ] ] -RRB- cord-020871-1v6dcmt3 101 4 , , , cord-020871-1v6dcmt3 101 5 and and CC cord-020871-1v6dcmt3 101 6 hence hence RB cord-020871-1v6dcmt3 101 7 are be VBP cord-020871-1v6dcmt3 101 8 the the DT cord-020871-1v6dcmt3 101 9 only only JJ cord-020871-1v6dcmt3 101 10 ones one NNS cord-020871-1v6dcmt3 101 11 reported report VBN cord-020871-1v6dcmt3 101 12 here here RB cord-020871-1v6dcmt3 101 13 . . . 
cord-020871-1v6dcmt3 102 1 An an DT cord-020871-1v6dcmt3 102 2 important important JJ cord-020871-1v6dcmt3 102 3 aspect aspect NN cord-020871-1v6dcmt3 102 4 of of IN cord-020871-1v6dcmt3 102 5 BM25 BM25 NNP cord-020871-1v6dcmt3 102 6 is be VBZ cord-020871-1v6dcmt3 102 7 the the DT cord-020871-1v6dcmt3 102 8 fact fact NN cord-020871-1v6dcmt3 102 9 that that IN cord-020871-1v6dcmt3 102 10 the the DT cord-020871-1v6dcmt3 102 11 variation variation NN cord-020871-1v6dcmt3 102 12 of of IN cord-020871-1v6dcmt3 102 13 its -PRON- PRP$ cord-020871-1v6dcmt3 102 14 parameters parameter NNS cord-020871-1v6dcmt3 102 15 k k NNP cord-020871-1v6dcmt3 102 16 1 1 CD cord-020871-1v6dcmt3 102 17 and and CC cord-020871-1v6dcmt3 102 18 b b LS cord-020871-1v6dcmt3 102 19 could could MD cord-020871-1v6dcmt3 102 20 bring bring VB cord-020871-1v6dcmt3 102 21 significant significant JJ cord-020871-1v6dcmt3 102 22 improvement improvement NN cord-020871-1v6dcmt3 102 23 in in IN cord-020871-1v6dcmt3 102 24 performance performance NN cord-020871-1v6dcmt3 102 25 , , , cord-020871-1v6dcmt3 102 26 as as IN cord-020871-1v6dcmt3 102 27 reported report VBN cord-020871-1v6dcmt3 102 28 by by IN cord-020871-1v6dcmt3 102 29 Lipani Lipani NNP cord-020871-1v6dcmt3 102 30 et et NNP cord-020871-1v6dcmt3 102 31 al al NNP cord-020871-1v6dcmt3 102 32 . . . cord-020871-1v6dcmt3 103 1 [ [ -LRB- cord-020871-1v6dcmt3 103 2 8 8 CD cord-020871-1v6dcmt3 103 3 ] ] -RRB- cord-020871-1v6dcmt3 103 4 . . . 
cord-020871-1v6dcmt3 104 1 Therefore therefore RB cord-020871-1v6dcmt3 104 2 , , , cord-020871-1v6dcmt3 104 3 we -PRON- PRP cord-020871-1v6dcmt3 104 4 performed perform VBD cord-020871-1v6dcmt3 104 5 a a DT cord-020871-1v6dcmt3 104 6 parameter parameter NN cord-020871-1v6dcmt3 104 7 scan scan NN cord-020871-1v6dcmt3 104 8 for for IN cord-020871-1v6dcmt3 104 9 k k NN cord-020871-1v6dcmt3 104 10 1 1 CD cord-020871-1v6dcmt3 104 11 ∈ ∈ NN cord-020871-1v6dcmt3 104 12 [ [ -LRB- cord-020871-1v6dcmt3 104 13 0 0 CD cord-020871-1v6dcmt3 104 14 , , , cord-020871-1v6dcmt3 104 15 3 3 CD cord-020871-1v6dcmt3 104 16 ] ] -RRB- cord-020871-1v6dcmt3 104 17 and and CC cord-020871-1v6dcmt3 104 18 b b LS cord-020871-1v6dcmt3 104 19 ∈ ∈ VB cord-020871-1v6dcmt3 104 20 [ [ -LRB- cord-020871-1v6dcmt3 104 21 0 0 CD cord-020871-1v6dcmt3 104 22 , , , cord-020871-1v6dcmt3 104 23 1 1 CD cord-020871-1v6dcmt3 104 24 ] ] -RRB- cord-020871-1v6dcmt3 104 25 with with IN cord-020871-1v6dcmt3 104 26 a a DT cord-020871-1v6dcmt3 104 27 0.05 0.05 CD cord-020871-1v6dcmt3 104 28 step step NN cord-020871-1v6dcmt3 104 29 size size NN cord-020871-1v6dcmt3 104 30 for for IN cord-020871-1v6dcmt3 104 31 both both DT cord-020871-1v6dcmt3 104 32 parameters parameter NNS cord-020871-1v6dcmt3 104 33 . . . 
cord-020871-1v6dcmt3 105 1 For for IN cord-020871-1v6dcmt3 105 2 every every DT cord-020871-1v6dcmt3 105 3 TREC TREC NNP cord-020871-1v6dcmt3 105 4 topic topic NN cord-020871-1v6dcmt3 105 5 , , , cord-020871-1v6dcmt3 105 6 the the DT cord-020871-1v6dcmt3 105 7 scores score NNS cord-020871-1v6dcmt3 105 8 of of IN cord-020871-1v6dcmt3 105 9 the the DT cord-020871-1v6dcmt3 105 10 top top JJ cord-020871-1v6dcmt3 105 11 1000 1000 CD cord-020871-1v6dcmt3 105 12 documents document NNS cord-020871-1v6dcmt3 105 13 retrieved retrieve VBN cord-020871-1v6dcmt3 105 14 from from IN cord-020871-1v6dcmt3 105 15 BM25 BM25 NNP cord-020871-1v6dcmt3 105 16 were be VBD cord-020871-1v6dcmt3 105 17 normalised normalise VBN cord-020871-1v6dcmt3 105 18 to to IN cord-020871-1v6dcmt3 105 19 [ [ -LRB- cord-020871-1v6dcmt3 105 20 0,1 0,1 CD cord-020871-1v6dcmt3 105 21 ] ] -RRB- cord-020871-1v6dcmt3 105 22 with with IN cord-020871-1v6dcmt3 105 23 the the DT cord-020871-1v6dcmt3 105 24 min min NNP cord-020871-1v6dcmt3 105 25 - - HYPH cord-020871-1v6dcmt3 105 26 max max NN cord-020871-1v6dcmt3 105 27 normalisation normalisation NN cord-020871-1v6dcmt3 105 28 method method NN cord-020871-1v6dcmt3 105 29 , , , cord-020871-1v6dcmt3 105 30 and and CC cord-020871-1v6dcmt3 105 31 were be VBD cord-020871-1v6dcmt3 105 32 used use VBN cord-020871-1v6dcmt3 105 33 in in IN cord-020871-1v6dcmt3 105 34 calculating calculate VBG cord-020871-1v6dcmt3 105 35 the the DT cord-020871-1v6dcmt3 105 36 scores score NNS cord-020871-1v6dcmt3 105 37 of of IN cord-020871-1v6dcmt3 105 38 the the DT cord-020871-1v6dcmt3 105 39 documents document NNS cord-020871-1v6dcmt3 105 40 for for IN cord-020871-1v6dcmt3 105 41 the the DT cord-020871-1v6dcmt3 105 42 combined combined JJ cord-020871-1v6dcmt3 105 43 models model NNS cord-020871-1v6dcmt3 105 44 [ [ -LRB- cord-020871-1v6dcmt3 105 45 14 14 CD cord-020871-1v6dcmt3 105 46 ] ] -RRB- cord-020871-1v6dcmt3 105 47 . . . 
cord-020871-1v6dcmt3 106 1 The the DT cord-020871-1v6dcmt3 106 2 original original JJ cord-020871-1v6dcmt3 106 3 results result NNS cord-020871-1v6dcmt3 106 4 , , , cord-020871-1v6dcmt3 106 5 those those DT cord-020871-1v6dcmt3 106 6 of of IN cord-020871-1v6dcmt3 106 7 our -PRON- PRP$ cord-020871-1v6dcmt3 106 8 replication replication NN cord-020871-1v6dcmt3 106 9 experiments experiment NNS cord-020871-1v6dcmt3 106 10 with with IN cord-020871-1v6dcmt3 106 11 standard standard NN cord-020871-1v6dcmt3 106 12 ( ( -LRB- cord-020871-1v6dcmt3 106 13 k k NN cord-020871-1v6dcmt3 106 14 1 1 CD cord-020871-1v6dcmt3 106 15 = = SYM cord-020871-1v6dcmt3 106 16 1.2 1.2 CD cord-020871-1v6dcmt3 106 17 and and CC cord-020871-1v6dcmt3 106 18 b b NN cord-020871-1v6dcmt3 106 19 = = SYM cord-020871-1v6dcmt3 106 20 0.75 0.75 CD cord-020871-1v6dcmt3 106 21 ) ) -RRB- cord-020871-1v6dcmt3 106 22 and and CC cord-020871-1v6dcmt3 106 23 best good JJS cord-020871-1v6dcmt3 106 24 BM25 BM25 NNP cord-020871-1v6dcmt3 106 25 parameter parameter NN cord-020871-1v6dcmt3 106 26 values value NNS cord-020871-1v6dcmt3 106 27 - - HYPH cord-020871-1v6dcmt3 106 28 measured measure VBN cord-020871-1v6dcmt3 106 29 in in IN cord-020871-1v6dcmt3 106 30 terms term NNS cord-020871-1v6dcmt3 106 31 of of IN cord-020871-1v6dcmt3 106 32 Mean mean JJ cord-020871-1v6dcmt3 106 33 Average Average NNP cord-020871-1v6dcmt3 106 34 Precision Precision NNP cord-020871-1v6dcmt3 106 35 ( ( -LRB- cord-020871-1v6dcmt3 106 36 MAP MAP NNP cord-020871-1v6dcmt3 106 37 ) ) -RRB- cord-020871-1v6dcmt3 106 38 and and CC cord-020871-1v6dcmt3 106 39 Precision Precision NNP cord-020871-1v6dcmt3 106 40 at at IN cord-020871-1v6dcmt3 106 41 20 20 CD cord-020871-1v6dcmt3 107 1 ( ( -LRB- cord-020871-1v6dcmt3 107 2 P@20)-are p@20)-are NN cord-020871-1v6dcmt3 107 3 outlined outline VBN cord-020871-1v6dcmt3 107 4 in in IN cord-020871-1v6dcmt3 107 5 Table Table NNP cord-020871-1v6dcmt3 107 6 3 3 CD cord-020871-1v6dcmt3 107 7 . . . 
cord-020871-1v6dcmt3 108 1 We -PRON- PRP cord-020871-1v6dcmt3 108 2 replicated replicate VBD cord-020871-1v6dcmt3 108 3 previously previously RB cord-020871-1v6dcmt3 108 4 reported report VBD cord-020871-1v6dcmt3 108 5 experiments experiment NNS cord-020871-1v6dcmt3 108 6 that that WDT cord-020871-1v6dcmt3 108 7 presented present VBD cord-020871-1v6dcmt3 108 8 evidence evidence NN cord-020871-1v6dcmt3 108 9 that that IN cord-020871-1v6dcmt3 108 10 a a DT cord-020871-1v6dcmt3 108 11 new new JJ cord-020871-1v6dcmt3 108 12 mixture mixture NN cord-020871-1v6dcmt3 108 13 model model NN cord-020871-1v6dcmt3 108 14 , , , cord-020871-1v6dcmt3 108 15 based base VBN cord-020871-1v6dcmt3 108 16 on on IN cord-020871-1v6dcmt3 108 17 von von NNP cord-020871-1v6dcmt3 108 18 Mises Mises NNP cord-020871-1v6dcmt3 108 19 - - HYPH cord-020871-1v6dcmt3 108 20 Fisher Fisher NNP cord-020871-1v6dcmt3 108 21 distributions distribution NNS cord-020871-1v6dcmt3 108 22 , , , cord-020871-1v6dcmt3 108 23 outperformed outperform VBD cord-020871-1v6dcmt3 108 24 a a DT cord-020871-1v6dcmt3 108 25 series series NN cord-020871-1v6dcmt3 108 26 of of IN cord-020871-1v6dcmt3 108 27 other other JJ cord-020871-1v6dcmt3 108 28 models model NNS cord-020871-1v6dcmt3 108 29 in in IN cord-020871-1v6dcmt3 108 30 three three CD cord-020871-1v6dcmt3 108 31 tasks task NNS cord-020871-1v6dcmt3 108 32 ( ( -LRB- cord-020871-1v6dcmt3 108 33 classification classification NN cord-020871-1v6dcmt3 108 34 , , , cord-020871-1v6dcmt3 108 35 clustering clustering NN cord-020871-1v6dcmt3 108 36 , , , cord-020871-1v6dcmt3 108 37 and and CC cord-020871-1v6dcmt3 108 38 retrievalwhen retrievalwhen NNP cord-020871-1v6dcmt3 108 39 combined combine VBN cord-020871-1v6dcmt3 108 40 with with IN cord-020871-1v6dcmt3 108 41 standard standard JJ cord-020871-1v6dcmt3 108 42 retrieval retrieval NN cord-020871-1v6dcmt3 108 43 models model NNS cord-020871-1v6dcmt3 108 44 ) ) -RRB- cord-020871-1v6dcmt3 108 45 . . . 
cord-020871-1v6dcmt3 109 1 Since since IN cord-020871-1v6dcmt3 109 2 the the DT cord-020871-1v6dcmt3 109 3 source source NN cord-020871-1v6dcmt3 109 4 code code NN cord-020871-1v6dcmt3 109 5 was be VBD cord-020871-1v6dcmt3 109 6 not not RB cord-020871-1v6dcmt3 109 7 released release VBN cord-020871-1v6dcmt3 109 8 in in IN cord-020871-1v6dcmt3 109 9 the the DT cord-020871-1v6dcmt3 109 10 original original JJ cord-020871-1v6dcmt3 109 11 paper paper NN cord-020871-1v6dcmt3 109 12 , , , cord-020871-1v6dcmt3 109 13 important important JJ cord-020871-1v6dcmt3 109 14 implementation implementation NN cord-020871-1v6dcmt3 109 15 and and CC cord-020871-1v6dcmt3 109 16 formulation formulation NN cord-020871-1v6dcmt3 109 17 details detail NNS cord-020871-1v6dcmt3 109 18 were be VBD cord-020871-1v6dcmt3 109 19 omitted omit VBN cord-020871-1v6dcmt3 109 20 , , , cord-020871-1v6dcmt3 109 21 and and CC cord-020871-1v6dcmt3 109 22 the the DT cord-020871-1v6dcmt3 109 23 authors author NNS cord-020871-1v6dcmt3 109 24 never never RB cord-020871-1v6dcmt3 109 25 replied reply VBD cord-020871-1v6dcmt3 109 26 to to IN cord-020871-1v6dcmt3 109 27 our -PRON- PRP$ cord-020871-1v6dcmt3 109 28 request request NN cord-020871-1v6dcmt3 109 29 for for IN cord-020871-1v6dcmt3 109 30 information information NN cord-020871-1v6dcmt3 109 31 , , , cord-020871-1v6dcmt3 109 32 a a DT cord-020871-1v6dcmt3 109 33 significant significant JJ cord-020871-1v6dcmt3 109 34 effort effort NN cord-020871-1v6dcmt3 109 35 has have VBZ cord-020871-1v6dcmt3 109 36 been be VBN cord-020871-1v6dcmt3 109 37 devoted devote VBN cord-020871-1v6dcmt3 109 38 to to TO cord-020871-1v6dcmt3 109 39 reverse reverse VB cord-020871-1v6dcmt3 109 40 engineer engineer NN cord-020871-1v6dcmt3 109 41 the the DT cord-020871-1v6dcmt3 109 42 experiments experiment NNS cord-020871-1v6dcmt3 109 43 . . . 
cord-020871-1v6dcmt3 110 1 In in IN cord-020871-1v6dcmt3 110 2 general general JJ cord-020871-1v6dcmt3 110 3 , , , cord-020871-1v6dcmt3 110 4 for for IN cord-020871-1v6dcmt3 110 5 none none NN cord-020871-1v6dcmt3 110 6 of of IN cord-020871-1v6dcmt3 110 7 the the DT cord-020871-1v6dcmt3 110 8 tasks task NNS cord-020871-1v6dcmt3 110 9 were be VBD cord-020871-1v6dcmt3 110 10 we -PRON- PRP cord-020871-1v6dcmt3 110 11 able able JJ cord-020871-1v6dcmt3 110 12 to to TO cord-020871-1v6dcmt3 110 13 confirm confirm VB cord-020871-1v6dcmt3 110 14 the the DT cord-020871-1v6dcmt3 110 15 conclusions conclusion NNS cord-020871-1v6dcmt3 110 16 of of IN cord-020871-1v6dcmt3 110 17 the the DT cord-020871-1v6dcmt3 110 18 previous previous JJ cord-020871-1v6dcmt3 110 19 experiments experiment NNS cord-020871-1v6dcmt3 110 20 : : : cord-020871-1v6dcmt3 110 21 we -PRON- PRP cord-020871-1v6dcmt3 110 22 do do VBP cord-020871-1v6dcmt3 110 23 not not RB cord-020871-1v6dcmt3 110 24 have have VB cord-020871-1v6dcmt3 110 25 enough enough JJ cord-020871-1v6dcmt3 110 26 evidence evidence NN cord-020871-1v6dcmt3 110 27 to to TO cord-020871-1v6dcmt3 110 28 conclude conclude VB cord-020871-1v6dcmt3 110 29 that that IN cord-020871-1v6dcmt3 110 30 FV fv NN cord-020871-1v6dcmt3 110 31 - - HYPH cord-020871-1v6dcmt3 110 32 moVMF movmf NN cord-020871-1v6dcmt3 110 33 outperforms outperform VBZ cord-020871-1v6dcmt3 110 34 the the DT cord-020871-1v6dcmt3 110 35 other other JJ cord-020871-1v6dcmt3 110 36 methods method NNS cord-020871-1v6dcmt3 110 37 . . . 
cord-020871-1v6dcmt3 111 1 The the DT cord-020871-1v6dcmt3 111 2 situation situation NN cord-020871-1v6dcmt3 111 3 is be VBZ cord-020871-1v6dcmt3 111 4 rather rather RB cord-020871-1v6dcmt3 111 5 different different JJ cord-020871-1v6dcmt3 111 6 when when WRB cord-020871-1v6dcmt3 111 7 considering consider VBG cord-020871-1v6dcmt3 111 8 the the DT cord-020871-1v6dcmt3 111 9 effectiveness effectiveness NN cord-020871-1v6dcmt3 111 10 of of IN cord-020871-1v6dcmt3 111 11 these these DT cord-020871-1v6dcmt3 111 12 document document NN cord-020871-1v6dcmt3 111 13 representations representation NNS cord-020871-1v6dcmt3 111 14 for for IN cord-020871-1v6dcmt3 111 15 clustering clustering NN cord-020871-1v6dcmt3 111 16 purposes purpose NNS cord-020871-1v6dcmt3 111 17 : : : cord-020871-1v6dcmt3 111 18 we -PRON- PRP cord-020871-1v6dcmt3 111 19 find find VBP cord-020871-1v6dcmt3 111 20 indeed indeed RB cord-020871-1v6dcmt3 111 21 that that IN cord-020871-1v6dcmt3 111 22 the the DT cord-020871-1v6dcmt3 111 23 FV FV NNP cord-020871-1v6dcmt3 111 24 - - HYPH cord-020871-1v6dcmt3 111 25 moVMF movmf NN cord-020871-1v6dcmt3 111 26 significantly significantly RB cord-020871-1v6dcmt3 111 27 underperforms underperform VBZ cord-020871-1v6dcmt3 111 28 , , , cord-020871-1v6dcmt3 111 29 contradicting contradict VBG cord-020871-1v6dcmt3 111 30 previous previous JJ cord-020871-1v6dcmt3 111 31 conclusions conclusion NNS cord-020871-1v6dcmt3 111 32 . . . cord-020871-1v6dcmt3 112 1 In in IN cord-020871-1v6dcmt3 112 2 the the DT cord-020871-1v6dcmt3 112 3 case case NN cord-020871-1v6dcmt3 112 4 of of IN cord-020871-1v6dcmt3 112 5 retrieval retrieval NN cord-020871-1v6dcmt3 112 6 , , , cord-020871-1v6dcmt3 112 7 although although IN cord-020871-1v6dcmt3 112 8 Zhang Zhang NNP cord-020871-1v6dcmt3 112 9 et et NNP cord-020871-1v6dcmt3 112 10 al al NNP cord-020871-1v6dcmt3 112 11 . . 
NNP cord-020871-1v6dcmt3 112 12 's 's POS cord-020871-1v6dcmt3 112 13 proposed propose VBN cord-020871-1v6dcmt3 112 14 method method NN cord-020871-1v6dcmt3 112 15 ( ( -LRB- cord-020871-1v6dcmt3 112 16 FV fv NN cord-020871-1v6dcmt3 112 17 - - HYPH cord-020871-1v6dcmt3 112 18 moVMF movmf NN cord-020871-1v6dcmt3 112 19 ) ) -RRB- cord-020871-1v6dcmt3 112 20 indeed indeed RB cord-020871-1v6dcmt3 112 21 boosts boost VBZ cord-020871-1v6dcmt3 112 22 BM25 BM25 NNP cord-020871-1v6dcmt3 112 23 , , , cord-020871-1v6dcmt3 112 24 it -PRON- PRP cord-020871-1v6dcmt3 112 25 does do VBZ cord-020871-1v6dcmt3 112 26 not not RB cord-020871-1v6dcmt3 112 27 outperform outperform VB cord-020871-1v6dcmt3 112 28 most most JJS cord-020871-1v6dcmt3 112 29 of of IN cord-020871-1v6dcmt3 112 30 the the DT cord-020871-1v6dcmt3 112 31 other other JJ cord-020871-1v6dcmt3 112 32 models model NNS cord-020871-1v6dcmt3 112 33 it -PRON- PRP cord-020871-1v6dcmt3 112 34 was be VBD cord-020871-1v6dcmt3 112 35 compared compare VBN cord-020871-1v6dcmt3 112 36 to to IN cord-020871-1v6dcmt3 112 37 . . . 
cord-020871-1v6dcmt3 113 1 Clustering cluster VBG cord-020871-1v6dcmt3 113 2 on on IN cord-020871-1v6dcmt3 113 3 the the DT cord-020871-1v6dcmt3 113 4 unit unit NN cord-020871-1v6dcmt3 113 5 hypersphere hypersphere NN cord-020871-1v6dcmt3 113 6 using use VBG cord-020871-1v6dcmt3 113 7 von von NNP cord-020871-1v6dcmt3 113 8 Mises Mises NNP cord-020871-1v6dcmt3 113 9 - - HYPH cord-020871-1v6dcmt3 113 10 Fisher Fisher NNP cord-020871-1v6dcmt3 113 11 distributions distribution NNS cord-020871-1v6dcmt3 113 12 Latent Latent NNP cord-020871-1v6dcmt3 113 13 Dirichlet dirichlet JJ cord-020871-1v6dcmt3 113 14 allocation allocation NN cord-020871-1v6dcmt3 113 15 Aggregating aggregate VBG cord-020871-1v6dcmt3 113 16 continuous continuous JJ cord-020871-1v6dcmt3 113 17 word word NN cord-020871-1v6dcmt3 113 18 embeddings embedding NNS cord-020871-1v6dcmt3 113 19 for for IN cord-020871-1v6dcmt3 113 20 information information NN cord-020871-1v6dcmt3 113 21 retrieval retrieval NN cord-020871-1v6dcmt3 113 22 Indexing indexing NN cord-020871-1v6dcmt3 113 23 by by IN cord-020871-1v6dcmt3 113 24 latent latent JJ cord-020871-1v6dcmt3 113 25 semantic semantic JJ cord-020871-1v6dcmt3 113 26 analysis analysis NN cord-020871-1v6dcmt3 113 27 Distributional distributional JJ cord-020871-1v6dcmt3 113 28 structure structure NN cord-020871-1v6dcmt3 113 29 . . . cord-020871-1v6dcmt3 114 1 Word word NN cord-020871-1v6dcmt3 114 2 Let let VB cord-020871-1v6dcmt3 115 1 's 's POS cord-020871-1v6dcmt3 115 2 measure measure NN cord-020871-1v6dcmt3 115 3 run run VBN cord-020871-1v6dcmt3 115 4 time time NN cord-020871-1v6dcmt3 115 5 ! ! . 
cord-020871-1v6dcmt3 116 1 Extending extend VBG cord-020871-1v6dcmt3 116 2 the the DT cord-020871-1v6dcmt3 116 3 IR IR NNP cord-020871-1v6dcmt3 116 4 replicability replicability NN cord-020871-1v6dcmt3 116 5 infrastructure infrastructure NN cord-020871-1v6dcmt3 116 6 to to TO cord-020871-1v6dcmt3 116 7 include include VB cord-020871-1v6dcmt3 116 8 performance performance NN cord-020871-1v6dcmt3 116 9 aspects aspect NNS cord-020871-1v6dcmt3 117 1 Distributed distribute VBN cord-020871-1v6dcmt3 117 2 representations representation NNS cord-020871-1v6dcmt3 117 3 of of IN cord-020871-1v6dcmt3 117 4 sentences sentence NNS cord-020871-1v6dcmt3 117 5 and and CC cord-020871-1v6dcmt3 117 6 documents document NNS cord-020871-1v6dcmt3 117 7 Verboseness verboseness JJ cord-020871-1v6dcmt3 117 8 fission fission NN cord-020871-1v6dcmt3 117 9 for for IN cord-020871-1v6dcmt3 118 1 BM25 BM25 NNP cord-020871-1v6dcmt3 118 2 document document NN cord-020871-1v6dcmt3 118 3 length length NN cord-020871-1v6dcmt3 118 4 normalization normalization NN cord-020871-1v6dcmt3 119 1 Efficient efficient JJ cord-020871-1v6dcmt3 119 2 estimation estimation NN cord-020871-1v6dcmt3 119 3 of of IN cord-020871-1v6dcmt3 119 4 word word NN cord-020871-1v6dcmt3 119 5 representations representation NNS cord-020871-1v6dcmt3 119 6 in in IN cord-020871-1v6dcmt3 119 7 vector vector NN cord-020871-1v6dcmt3 119 8 space space NN cord-020871-1v6dcmt3 119 9 Neural Neural NNP cord-020871-1v6dcmt3 119 10 information information NN cord-020871-1v6dcmt3 119 11 retrieval retrieval NN cord-020871-1v6dcmt3 119 12 : : : cord-020871-1v6dcmt3 119 13 at at IN cord-020871-1v6dcmt3 119 14 the the DT cord-020871-1v6dcmt3 119 15 end end NN cord-020871-1v6dcmt3 119 16 of of IN cord-020871-1v6dcmt3 119 17 the the DT cord-020871-1v6dcmt3 119 18 early early JJ cord-020871-1v6dcmt3 119 19 years year NNS cord-020871-1v6dcmt3 119 20 A a DT cord-020871-1v6dcmt3 119 21 sentimental sentimental JJ cord-020871-1v6dcmt3 119 22 education 
education NN cord-020871-1v6dcmt3 119 23 : : : cord-020871-1v6dcmt3 119 24 sentiment sentiment NN cord-020871-1v6dcmt3 119 25 analysis analysis NN cord-020871-1v6dcmt3 119 26 using use VBG cord-020871-1v6dcmt3 119 27 subjectivity subjectivity NN cord-020871-1v6dcmt3 119 28 summarization summarization NN cord-020871-1v6dcmt3 119 29 based base VBN cord-020871-1v6dcmt3 119 30 on on IN cord-020871-1v6dcmt3 119 31 minimum minimum JJ cord-020871-1v6dcmt3 119 32 cuts cut NNS cord-020871-1v6dcmt3 119 33 Seeing see VBG cord-020871-1v6dcmt3 119 34 stars star NNS cord-020871-1v6dcmt3 119 35 : : : cord-020871-1v6dcmt3 119 36 exploiting exploit VBG cord-020871-1v6dcmt3 119 37 class class NN cord-020871-1v6dcmt3 119 38 relationships relationship NNS cord-020871-1v6dcmt3 119 39 for for IN cord-020871-1v6dcmt3 119 40 sentiment sentiment NN cord-020871-1v6dcmt3 119 41 categorization categorization NN cord-020871-1v6dcmt3 119 42 with with IN cord-020871-1v6dcmt3 119 43 respect respect NN cord-020871-1v6dcmt3 119 44 to to IN cord-020871-1v6dcmt3 119 45 rating rating NN cord-020871-1v6dcmt3 119 46 scales scale NNS cord-020871-1v6dcmt3 120 1 The the DT cord-020871-1v6dcmt3 120 2 TREC TREC NNP cord-020871-1v6dcmt3 120 3 robust robust JJ cord-020871-1v6dcmt3 120 4 retrieval retrieval NN cord-020871-1v6dcmt3 120 5 track track NN cord-020871-1v6dcmt3 120 6 . . . 
cord-020871-1v6dcmt3 121 1 SIGIR SIGIR NNP cord-020871-1v6dcmt3 121 2 Forum Forum NNP cord-020871-1v6dcmt3 122 1 Aggregating aggregate VBG cord-020871-1v6dcmt3 122 2 neural neural JJ cord-020871-1v6dcmt3 122 3 word word NN cord-020871-1v6dcmt3 122 4 embeddings embedding NNS cord-020871-1v6dcmt3 122 5 for for IN cord-020871-1v6dcmt3 122 6 document document NN cord-020871-1v6dcmt3 122 7 representation representation NN cord-020871-1v6dcmt3 122 8 EARS ear VBZ cord-020871-1v6dcmt3 122 9 2019 2019 CD cord-020871-1v6dcmt3 122 10 : : : cord-020871-1v6dcmt3 123 1 the the DT cord-020871-1v6dcmt3 123 2 2nd 2nd JJ cord-020871-1v6dcmt3 123 3 international international JJ cord-020871-1v6dcmt3 123 4 workshop workshop NN cord-020871-1v6dcmt3 123 5 on on IN cord-020871-1v6dcmt3 123 6 explainable explainable JJ cord-020871-1v6dcmt3 123 7 recommendation recommendation NN cord-020871-1v6dcmt3 123 8 and and CC cord-020871-1v6dcmt3 123 9 search search NN cord-020871-1v6dcmt3 123 10 Authors author NNS cord-020871-1v6dcmt3 123 11 are be VBP cord-020871-1v6dcmt3 123 12 partially partially RB cord-020871-1v6dcmt3 123 13 supported support VBN cord-020871-1v6dcmt3 123 14 by by IN cord-020871-1v6dcmt3 123 15 the the DT cord-020871-1v6dcmt3 123 16 H2020 H2020 NNP cord-020871-1v6dcmt3 123 17 Safe Safe NNP cord-020871-1v6dcmt3 123 18 - - HYPH cord-020871-1v6dcmt3 123 19 DEED deed NN cord-020871-1v6dcmt3 123 20 project project NN cord-020871-1v6dcmt3 123 21 ( ( -LRB- cord-020871-1v6dcmt3 123 22 GA GA NNP cord-020871-1v6dcmt3 123 23 825225 825225 CD cord-020871-1v6dcmt3 123 24 ) ) -RRB- cord-020871-1v6dcmt3 123 25 . . .