Ask Me Anything

This blog posting describes a Python script called Ask Me Anything -- a rudimentary chatbot.

Run the script (ask-me-anything.py) and give it one of two different command-line arguments: 1) homer, or 2) austen. For example:

  $ ./ask-me-anything.py homer

The script will read all the .csv files in the directory named by the command-line argument, and then prompt you to enter a question. Questions can vary from the very simple to the very complex. Once a question is submitted, the system will return a list of questions and answers kinda-, sorta- matching the question. (The script uses a measurement called "cosine similarity" to do this work.) Each returned question ought to be more-than-plausible as well as associated with a confidence score. (No scores should be less than 0.5 and very few scores will equal 1.0.) Once the questions/answers are output, you will be prompted for another question. Press ^c to stop asking questions.
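
To make the cosine similarity matching described above a bit more concrete, here is a minimal sketch of the idea. It is not the code from ask-me-anything.py. The function names (vectorize, cosine_similarity, ask), the PAIRS list, and the use of simple word counts as vectors are all assumptions made for illustration. The real script evidently builds its vectors some other way, since its scores in the transcript differ from what word counts would produce, so the numbers printed here will not match the transcript.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a string into a bag-of-words vector (word -> count).

    Assumption: plain word counts stand in for whatever vectors
    the real script uses.
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Return the cosine of the angle between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ask(query, pairs, threshold=0.5):
    """Return (question, answer, score) tuples scoring at or above threshold."""
    query_vector = vectorize(query)
    hits = []
    for question, answer in pairs:
        score = cosine_similarity(query_vector, vectorize(question))
        if score >= threshold:
            hits.append((question, answer, round(score, 2)))
    # Best matches first
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical question/answer pairs, borrowed from the transcript below;
# the real models live in the .csv files, whose layout is not assumed here.
PAIRS = [
    ("What did Hector do?", "killed Periphetes of Mycenae"),
    ("Who is war's steward?", "Jove"),
    ("Where did he go?", "Highbury"),
]

for question, answer, score in ask("what did hector do", PAIRS):
    print(f"  * {question} ({answer} / {score})")
```

In this sketch, a stored question is returned only when its score reaches the 0.5 cutoff mentioned above, which is why a query sharing a single common word with a stored question is usually filtered out.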

The two underlying models (homer and austen) correspond to the Iliad/Odyssey by Homer and Jane Austen's Emma.

Begin by asking very simple one-word questions. For example, submit a question word like who, what, when, where, why, or how. Alternatively, enter words of interest. I always enter words like love, honor, truth, justice, and beauty. Continue by asking more specific questions, questions that begin with a question word and are followed by a verb. Examples include: who was, who is, who will be, how many, how did, when was, etc. Once you get that far, do two things. First, complete the rudimentary questions with words of interest. For example, "who was she", or "where did they". Second, use the results of all your queries as hints for other questions. The asking of questions is an iterative process. The system is dumb; the system does not implement any grammar. Thus, "who is" may return nothing but "who was" will return something. Enhancing the system to support grammar is an exercise left up to the reader. :)

Here is an abbreviated transcript:

  $ ./ask-me-anything.py homer

  Ask me anything: who is hector
    * Who is Hector's son? (Peleus / 0.72)
    * What is with Hector? (the hand of heaven / 0.65)
    * What is with Hector? (Hector is bringing them up in force / 0.56)
    * What is Hector's name? (Deiphobus / 0.51)
    ...
  
  Ask me anything: what did hector do
    * What did Hector do? (hound the proud Trojans on against the Achaeans / 0.54)
    * What did Hector do? (Twelve men did he thus kill / 0.57)
    * What did Hector do? (killed Periphetes of Mycenae / 0.61)
    * What did Hector do to Hector? (struck him in the lower part of the belly / 0.53)
    * What did Hector do to Hector? (made Hector turn fainthearted / 0.51)
    * What did Hector do to me? (slay me / 0.6)
    * What did Hector son of Priam do? (take the life / 0.62)
    * What did Hector send you to do? (plunder the bodies of the slain / 0.57)
    ...

  Ask me anything: what is war
    * Who is war's steward? (Jove / 0.98)
    * What is beyond question in war? (your prowess / 0.92)

  Ask me anything: ^C
  Okay, bye bye, and thank you.

  $ ./ask-me-anything.py austen

  Ask me anything: who is emma
    * What is Emma's name? (Emma Woodhouse / 0.5)
    * Who is in the way of Emma? (His aunt / 0.52)
    * Who is Emma's confidant? (Harriet / 0.55)

  Ask me anything: what is love
    * Who is Mr. Elton in love with? (Harriet / 0.99)
    * Who is the man who is in love with Harriet Smith? (Mr. Knightley / 0.5)
    * What did he love about her? (thorough excellence of mind / 0.6)
    * What is the strange thing about love? (he can see ready wit in Harriet / 0.55)

  Ask me anything: where did elton go
    * Where did he go? (Highbury / 0.88)
    * Where did Jane go the next day? (Highbury / 0.94)

  Ask me anything: ^C
  Okay, bye bye, and thank you.

Why did I write this script? Because there are many ways to read a text, and by transforming the text into a set of questions, one can garner a whole lot of meaning. More specifically, this process returns real-world, human-intelligible results instead of obtuse counts, tabulations, charts, graphs, frequencies, etc. I'll bet that if you play with this system for ten or fifteen minutes, then you will go away from the process knowing much more about the texts.

Imagine if libraries were to digitize each & every book in their collections and then, inside the MARC records (think 856$u), there were a link to an implementation of Ask Me Anything. Think how such a service would enhance librarianship.

If somebody asks, I will elaborate on how I created the model(s) in the first place, but only after they take a closer look at the .csv files in the given model directories. Download the script as well as its models from ask-me-anything.zip.


Creator: Eric Lease Morgan <emorgan@nd.edu>
Source: This is the initial posting of this document.
Date created: 2023-11-07
Date updated: 2023-11-07
Subject(s): hacks; chatbots;
URL: https://distantreader.org/blog/ask-me-anything/