id: work_34l7lqxmcngh5ftsvxihpewbnu
author: Gerold Schneider
title: Parsing early and late modern English corpora
date: 2014
pages: 17
extension: .pdf
mime: application/pdf
words: 9226
sentences: 1108
flesch: 70
summary: We describe, evaluate, and improve the automatic annotation of diachronic corpora at the levels of word-class, lemma, chunks, and dependency syntax. the tagging changes that are due to the normalization and observe improvements, We evaluate the improvement on parsing performance, comparing original text, standard auxiliary verbs in the ZEN corpus, showing that despite high noise levels linguistic signals clearly emerge, opening new possibilities for large-scale research of and discuss the training and adaptation of the normalization tool VARD (Baron and Rayson, 2008). Table 1 Most frequent VARD normalizations of ZEN In order to evaluate the relative improvements between the original ZEN text (z0), the default VARD Robust broad-coverage syntactic parsers, for example, Collins (1999), Nivre (2006), Schneider sample from the ARCHER corpus 17th century section, 131 normalizations are made (in VARD batch Of the 422 sentences, 332 obtain a different syntactic analysis when using the standard VARD settings.
cache: ./cache/work_34l7lqxmcngh5ftsvxihpewbnu.pdf
txt: ./txt/work_34l7lqxmcngh5ftsvxihpewbnu.txt