id: work_qblac7ctoveq5ldft75qxl6k3a
author: Lidia Pivovarova
title: Word Clustering for Historical Newspapers Analysis
date: 2019
pages: 8
extension: .pdf
mime: application/pdf
words: 4464
sentences: 331
flesch: 52
summary: Word Clustering for Historical Newspapers Analysis We propose a two-step procedure to trace differences in word usages over time: training This paper focuses on a case study of ideological terms ending with -ism suffix, such as liberalism, socialism, or conservatism, in nineteenth century Finnish newspapers and how usage of ideological isms is different from other words with Figure 1: A selection of the most frequent words ending with suffix -ism/ismi. We cluster word embeddings into semantically only clusters that contain at least one ism word are Finnish data that contain words related to rheumatism. Table 3: Clusters containing Finnish words related to rheumatism. Swedish data that contain the word spiritism. Table 4: Clusters containing Swedish word spiritism. Table 5: Swedish clusters containing word separatism. Table 6: Finnish clusters containing word separatismi. The 1880-1899 cluster contains completely different set of words, including reference to specific political entities, such as clustering embeddings of selected words together
cache: ./cache/work_qblac7ctoveq5ldft75qxl6k3a.pdf
txt: ./txt/work_qblac7ctoveq5ldft75qxl6k3a.txt