id: work_tksk7sfy4vfgxpiexqvpfxwrmu
author: Ehud Alexander Avner
title: Identifying translationese at the word and sub-word level
date: 2014.0
pages: 44
extension: .pdf
mime: application/pdf
words: 13056
sentences: 1114
flesch: 62
summary: We use text classification to distinguish automatically between original and translated texts in Hebrew, a morphologically complex language. classification techniques can distinguish between original and translated texts with high accuracy, and indeed, several translationese classifiers have been defined for a few European with numerous feature sets: frequencies of unigrams, bigrams, and trigrams of words, lemmas, part-of-speech (POS) tags, and a mixed mode in which function words are left untouched in their surface form, while content words are substituted by their corresponding cal purpose of developing a classifier for translationese and "explore the characteristic [universal] features which most influence the translated language" (p. In addition, they train a classifier on EUROPARL and test it on a different corpus containing newspaper articles in original Secondly, we test how well the classifiers predict translationese in a different domain, but on texts translated from the same source language as the training data
cache: ./cache/work_tksk7sfy4vfgxpiexqvpfxwrmu.pdf
txt: ./txt/work_tksk7sfy4vfgxpiexqvpfxwrmu.txt