

README

This directory contains a Python script, a spaCy model, some sample data, and some sample output.

The model is a locally trained named-entity extractor. More specifically, it is trained to identify human values. Given the name of a plain text file, the script loads the model and outputs the human values found in that file. Sample usage:

  $ ./bin/extract-values.py ./sample-data/bacon.txt | sort | uniq -c | sort -rn | less

The script was applied to three different text files: 1) Francis Bacon's Essays, 2) Machiavelli's The Prince, and 3) the complete works of Shakespeare. The pie charts in the png directory illustrate which human values were found in each file and in what proportion.
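
The counts behind such charts can be reproduced without the shell pipeline. What follows is only a minimal sketch, not the method used to make the charts. It assumes the script prints one human value per line, which the sample usage above implies but does not state, and the function name tally is made up for illustration:

```python
from collections import Counter


def tally(lines):
    """Count each non-empty line; return (value, count, share) tuples, most frequent first.

    Assumption: every input line holds exactly one extracted human value.
    """
    counts = Counter(line.strip() for line in lines if line.strip())
    total = sum(counts.values())
    # most_common() sorts by descending count, like `sort | uniq -c | sort -rn`
    return [(value, n, n / total) for value, n in counts.most_common()]


def report(lines):
    """Print count, share of the total, and value, one row per distinct value."""
    for value, n, share in tally(lines):
        print(f"{n:7d}  {share:6.1%}  {value}")
```

Calling report(sys.stdin) from a small wrapper would turn this into a filter that can take the place of "sort | uniq -c | sort -rn" in the sample usage, and the share column gives the proportions a pie chart would display.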

The script could easily be modified to output the sentences in which the human values were found; one might do this for verification or closer reading.
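
One way such a modification might look is sketched below. It is not taken from extract-values.py. The names values_with_sentences and main, and the model_path and text_path parameters, are hypothetical. The sketch also assumes the model may lack a sentence-splitting component, in which case spaCy's rule-based sentencizer is added:

```python
def values_with_sentences(doc):
    """Yield (value, sentence) pairs for every entity in a spaCy Doc."""
    for ent in doc.ents:
        # collapse line breaks and runs of spaces so each pair fits on one line
        yield ent.text, " ".join(ent.sent.text.split())


def main(model_path, text_path):
    """Load a model, process a plain text file, and print value<TAB>sentence."""
    import spacy  # imported here so the helper above has no dependencies

    nlp = spacy.load(model_path)

    # Assumption: an NER-only pipeline sets no sentence boundaries,
    # and Span.sent needs them, so add the rule-based sentencizer.
    if not any(nlp.has_pipe(name) for name in ("parser", "senter", "sentencizer")):
        nlp.add_pipe("sentencizer", first=True)

    with open(text_path, encoding="utf-8") as handle:
        text = handle.read()

    # Long texts can exceed spaCy's default length limit; raising it costs memory.
    nlp.max_length = max(nlp.max_length, len(text) + 1)

    for value, sentence in values_with_sentences(nlp(text)):
        print(f"{value}\t{sentence}")
```

Calling main() with the path to the model and the path to a text file prints each value beside the sentence it came from, which makes spot-checking the model's output straightforward.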

The model is far from perfect, but it does output more than plausible results.

Fun!

--
Eric Lease Morgan <emorgan@nd.edu>
August 6, 2025