This is the stub of an index.html file; this file was automatically generated to describe the Distant Reader study carrel ("data set") it represents, specifically, a whole lot of an electronic journal called Changing Societies & Personalities. Even more specifically, this study carrel is a collection of content -- probably journal articles -- harvested and cached from an OAI-PMH repository as implemented by Open Journal Systems (OJS). Why probably? Because OJS is typically used to host and publish scholarly open access journal content. The OAI-PMH data repository root URL of the journal is https://changing-sp.com/ojs/index.php/csp/oai, and to browse the repository's content in its raw form, start at https://changing-sp.com/ojs/index.php/csp/oai?verb=Identify.
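As a minimal sketch of how the repository can be addressed, the following Python builds OAI-PMH request URLs from the root URL given above. The `Identify` and `ListRecords` verbs and the `oai_dc` metadata prefix come from the OAI-PMH specification; the helper name `oai_url` is my own invention, and the sketch only constructs URLs, it does not fetch anything.

```python
from urllib.parse import urlencode

# Root URL of the journal's OAI-PMH repository, as given in the text.
BASE_URL = "https://changing-sp.com/ojs/index.php/csp/oai"


def oai_url(verb, **params):
    """Build an OAI-PMH request URL (hypothetical helper, for illustration)."""
    query = {"verb": verb, **params}
    return f"{BASE_URL}?{urlencode(query)}"


# Describe the repository itself.
identify = oai_url("Identify")

# List records as unqualified Dublin Core, the format every
# OAI-PMH repository is required to support.
list_records = oai_url("ListRecords", metadataPrefix="oai_dc")

print(identify)
print(list_records)
```

Pasting either URL into a Web browser ought to return XML. Large result sets are usually returned in pages, so a real harvester would also need to follow the resumption tokens the repository hands back.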
I harvested the content of this study carrel, and it includes content dated as early as 2017 and as late as 2025. There are 356 items ("articles") in the collection for a total of 2,452,525 words. By comparison, the Bible is about 800,000 words long and Melville's Moby Dick is about 250,000 words long. Now, ask yourself, "To what degree is this collection large or small?" Incidentally, the collection has an average Flesch readability score of 43, and based on my experience, scholarly journal articles usually have readability scores between 50 and 60. The frequency of articles between 2017 and 2025 is visualized below, and now you can address the question, "How consistently has this journal been publishing, and to what extent?"
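To put those numbers in perspective, here is a small sketch of the arithmetic, using only the figures quoted above. It also includes the standard Flesch Reading Ease formula; I am assuming that formula for illustration, and the function name and the sample counts passed to it are hypothetical, not values taken from the carrel.

```python
# Figures quoted in the text above.
TOTAL_WORDS = 2_452_525
ITEMS = 356
BIBLE_WORDS = 800_000      # approximate, per the text
MOBY_DICK_WORDS = 250_000  # approximate, per the text

average_words = TOTAL_WORDS / ITEMS
bibles = TOTAL_WORDS / BIBLE_WORDS
moby_dicks = TOTAL_WORDS / MOBY_DICK_WORDS

print(f"average words per item: {average_words:.0f}")
print(f"size in Bibles: {bibles:.2f}")
print(f"size in Moby Dicks: {moby_dicks:.2f}")


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease; higher scores mean easier reading."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Hypothetical counts, only to show how the formula behaves.
example_score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(f"example score: {example_score:.1f}")
```

In other words, the collection is roughly three Bibles or almost ten copies of Moby Dick long, and the typical item runs to just under 7,000 words.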

Date ranges
The scope of the collection has been modeled in a number of ways. The most rudimentary models are simple lists of the carrel's items and their bibliographic characteristics (authors, titles, dates, etc.). These models are available in both plain text and JSON formats. The former is easy to read, and the latter is more computable. As an example of what can be done with the JSON file, you can quickly and easily grasp the scope of the collection by reading the pathfinder.
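To illustrate why the JSON form is "more computable", here is a minimal sketch that counts items per year. The layout of the carrel's actual JSON file is not described here, so the field names (`author`, `title`, `date`) and the three sample records are assumptions made up purely for illustration.

```python
import json
from collections import Counter

# Made-up records; the real file's structure and values may differ.
SAMPLE = """
[
  {"author": "Author A", "title": "Example title one", "date": "2017-03-01"},
  {"author": "Author B", "title": "Example title two", "date": "2025-06-15"},
  {"author": "Author C", "title": "Example title three", "date": "2025-09-30"}
]
"""


def items_per_year(records):
    """Count records by the four-digit year at the start of each date."""
    return Counter(record["date"][:4] for record in records)


records = json.loads(SAMPLE)
per_year = items_per_year(records)

for year in sorted(per_year):
    print(year, per_year[year])
```

The same few lines, pointed at the real file and adjusted to its real field names, would be one way to produce the kind of tally behind a date-range chart.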
The scope of the carrel can begin to be illustrated by observing the carrel's unigram, bigram, and computed keyword frequencies. These frequencies take a set of stop words into account, meaning, stop words are not included in the analysis. After observing the word clouds (below) you can begin to address the question, "What is this collection about? God? Knowledge? Truth? Justice? Beauty? If not, then what is it about?"
[Word clouds: unigrams, bigrams, keywords]
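The idea behind the unigram and bigram frequencies can be sketched in a few lines of Python. This is not necessarily how the Distant Reader computes them; the tiny stop word list, the sample sentence, and the function names are all assumptions for the sake of the example.

```python
import re
from collections import Counter

# A deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


text = "The social identity of the group shapes the social identity of the individual."
tokens = tokenize(text)
unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)

print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Note that this sketch removes stop words before forming bigrams, so words that were separated by a stop word end up counted as neighbors. Filtering bigrams after they are formed is an equally reasonable choice and gives slightly different counts.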
Topic modeling is an additional way to measure the aboutness of a corpus, and topic modeling is as much an art as it is a science. That said, after doing a bit of rudimentary topic modeling against this corpus, we might say it is about the following topics, where each topic ought to be read as if it were a hyphenated word made up of the feature words:
| labels | weights | features |
|---|---|---|
| russian | 0.41696 | russian social world society people societies time history |
| social | 0.27180 | social education development economic participation russian public countries |
| individuals | 0.22404 | social individuals model behavior gender age work analysis |
| children | 0.14097 | children family social russian russia women child migrants |
| art | 0.11875 | art city water urban industrial river place theatre |
| religious | 0.10638 | religious religion education religions donors violence donation islam |
| media | 0.08993 | media social body culture values image information communication |
| identity | 0.07492 | social identity indonesia group exclusion national people collective |
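To make the "hyphenated word" reading concrete, the sketch below joins each topic's feature words with hyphens and expresses each weight as a share of the total. The rows are copied from the table above; normalizing the weights so they sum to one is my own convenience for comparison, not something the topic model itself reports.

```python
# Rows copied from the table above: (label, weight, features).
TOPICS = [
    ("russian", 0.41696, "russian social world society people societies time history"),
    ("social", 0.27180, "social education development economic participation russian public countries"),
    ("individuals", 0.22404, "social individuals model behavior gender age work analysis"),
    ("children", 0.14097, "children family social russian russia women child migrants"),
    ("art", 0.11875, "art city water urban industrial river place theatre"),
    ("religious", 0.10638, "religious religion education religions donors violence donation islam"),
    ("media", 0.08993, "media social body culture values image information communication"),
    ("identity", 0.07492, "social identity indonesia group exclusion national people collective"),
]


def hyphenate(features):
    """Read a topic's feature words as one long hyphenated word."""
    return "-".join(features.split())


total_weight = sum(weight for _, weight, _ in TOPICS)
shares = {label: weight / total_weight for label, weight, _ in TOPICS}

for label, weight, features in TOPICS:
    print(f"{label:12} {shares[label]:6.1%}  {hyphenate(features)}")
```

Read this way, the heaviest topic accounts for a bit less than a third of the total weight, and the top three topics together account for almost two thirds.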

Topics

Topics over years
For more detail, see:
Eric Lease Morgan <eric_morgan@infomotions.com>
Date created: 2025-12-24