id author title date pages extension mime words sentences flesch summary cache txt work_n2ajueo2ozfp5hcybnms5ep22a Wei Xu Extracting Lexically Divergent Paraphrases from Twitter 2014 14 .pdf application/pdf 7515 858 64 We jointly model paraphrase relations between word and sentence ... model also captures lexically divergent paraphrases that differ from yet complement previous methods; combining our model with previous work significantly outperforms the state-of-the-art. Our approach to extract paraphrases from Twitter is general and can be combined with various topic detecting solutions. ... paraphrase identification, that specifically accommodates the very short context and divergent wording in Twitter data. 2 Joint Word-Sentence Paraphrase Model ... sentence-level annotations in our paraphrase corpus: In total, we constructed a Twitter Paraphrase Corpus of 18,762 sentence pairs and 19,946 unique sentences. Table 2: Performance of different paraphrase identification approaches on Twitter data. ... labels derived from 5 non-expert annotations on Mechanical Turk, which can be considered as an upper bound for automatic paraphrase recognition task. We filter the sentences within each topic to select more probable paraphrases for annotation. ... paraphrase task requires additional sentence alignment modeling with no word alignment data. ./cache/work_n2ajueo2ozfp5hcybnms5ep22a.pdf ./txt/work_n2ajueo2ozfp5hcybnms5ep22a.txt