mv: ‘./input-file.zip’ and ‘./input-file.zip’ are the same file
Creating study carrel named subject-falstaffJohnSirFictitiousCharacter-gutenberg
Initializing database
Unzipping
Archive: input-file.zip
  creating: ./tmp/input/input-file/
  inflating: ./tmp/input/input-file/24500.txt
  inflating: ./tmp/input/input-file/1116.txt
  inflating: ./tmp/input/input-file/1781.txt
  inflating: ./tmp/input/input-file/1517.txt
  inflating: ./tmp/input/input-file/2237.txt
  inflating: ./tmp/input/input-file/metadata.csv
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY: ./tmp/input/input-file
=== metadata file: ./tmp/input/input-file/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-falstaffJohnSirFictitiousCharacter-gutenberg
FILE: cache/24500.txt OUTPUT: txt/24500.txt
FILE: cache/1116.txt OUTPUT: txt/1116.txt
FILE: cache/1781.txt OUTPUT: txt/1781.txt
FILE: cache/1517.txt OUTPUT: txt/1517.txt
FILE: cache/2237.txt OUTPUT: txt/2237.txt
2237 txt/../ent/2237.ent
=== file2bib.sh ===
id: 24500
author: Acheson, Arthur
title: Shakespeare's Lost Years in London, 1586-1592
date:
pages:
extension: .txt
txt: ./txt/24500.txt
cache: ./cache/24500.txt
Content-Encoding ISO-8859-1
Content-Type text/plain; charset=ISO-8859-1
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 2
resourceName b'24500.txt'
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in
    text = textacy.preprocessing.normalize.normalize_quotation_marks( text )
  File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks
    return text.translate(QUOTE_TRANSLATION_TABLE)
AttributeError: 'NoneType' object has no attribute 'translate'
=== file2bib.sh ===
id: 2237
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
pages:
extension: .txt
txt: ./txt/2237.txt
cache: ./cache/2237.txt
Content-Encoding ISO-8859-1
Content-Type text/plain; charset=ISO-8859-1
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 1
resourceName b'2237.txt'
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/file2bib.py", line 107, in
    text = textacy.preprocessing.normalize.normalize_quotation_marks( text )
  File "/data-disk/python/lib/python3.8/site-packages/textacy/preprocessing/normalize.py", line 32, in normalize_quotation_marks
    return text.translate(QUOTE_TRANSLATION_TABLE)
AttributeError: 'NoneType' object has no attribute 'translate'
24500 txt/../ent/24500.ent
24500 txt/../wrd/24500.wrd
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in
    for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) :
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake
    word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words)
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores
    freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw)
  File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
1781 txt/../ent/1781.ent
1781 txt/../pos/1781.pos
1116 txt/../ent/1116.ent
1781 txt/../wrd/1781.wrd
2237 txt/../wrd/2237.wrd
Traceback (most recent call last):
  File "/data-disk/reader-compute/reader-classic/bin/txt2keywords.py", line 54, in
    for keyword, score in ( yake( doc, ngrams=NGRAMS, topn=TOPN ) ) :
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 96, in yake
    word_scores = _compute_word_scores(doc, word_occ_vals, word_freqs, stop_words)
  File "/data-disk/python/lib/python3.8/site-packages/textacy/ke/yake.py", line 205, in _compute_word_scores
    freq_baseline = statistics.mean(freqs_nsw) + statistics.stdev(freqs_nsw)
  File "/data-disk/python/lib/python3.8/statistics.py", line 315, in mean
    raise StatisticsError('mean requires at least one data point')
statistics.StatisticsError: mean requires at least one data point
2237 txt/../pos/2237.pos
24500 txt/../pos/24500.pos
1116 txt/../pos/1116.pos
1116 txt/../wrd/1116.wrd
=== file2bib.sh ===
id: 1116
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
pages:
extension: .txt
txt: ./txt/1116.txt
cache: ./cache/1116.txt
Content-Encoding UTF-8
Content-Type text/plain; charset=UTF-8
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 2
resourceName b'1116.txt'
=== file2bib.sh ===
id: 1781
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
pages:
extension: .txt
txt: ./txt/1781.txt
cache: ./cache/1781.txt
Content-Encoding UTF-8
Content-Type text/plain; charset=UTF-8
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 2
resourceName b'1781.txt'
1517 txt/../wrd/1517.wrd
1517 txt/../pos/1517.pos
=== file2bib.sh ===
id: 1517
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
pages:
extension: .txt
txt: ./txt/1517.txt
cache: ./cache/1517.txt
Content-Encoding ISO-8859-1
Content-Type text/plain; charset=ISO-8859-1
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 4
resourceName b'1517.txt'
1517 txt/../ent/1517.ent
Done mapping.
Reducing subject-falstaffJohnSirFictitiousCharacter-gutenberg
=== reduce.pl bib ===
=== reduce.pl bib ===
id = 1116
author = Shakespeare, William
title = The Merry Wives of Windsor
date =
pages =
extension = .txt
mime = text/plain
words = 40
sentences = 10
flesch = 88
summary = THIS EBOOK WAS ONE OF PROJECT GUTENBERG'S EARLY FILES PRODUCED AT A TIME WHEN PROOFING METHODS AND TOOLS WERE NOT WELL DEVELOPED. IS AN IMPROVED EDITION OF THIS TITLE WHICH MAY BE VIEWED AS EBOOK (#1517) at https://www.gutenberg.org/ebooks/1517
cache = ./cache/1116.txt
txt = ./txt/1116.txt
=== reduce.pl bib ===
id = 1781
author = Shakespeare, William
title = The Merry Wives of Windsor
date =
pages =
extension = .txt
mime = text/plain
words = 40
sentences = 10
flesch = 88
summary = THIS EBOOK WAS ONE OF PROJECT GUTENBERG'S EARLY FILES PRODUCED AT A TIME WHEN PROOFING METHODS AND TOOLS WERE NOT WELL DEVELOPED. IS AN IMPROVED EDITION OF THIS TITLE WHICH MAY BE VIEWED AS EBOOK (#1517) at https://www.gutenberg.org/ebooks/1517
cache = ./cache/1781.txt
txt = ./txt/1781.txt
=== reduce.pl bib ===
=== reduce.pl bib ===
id = 1517
author = Shakespeare, William
title = The Merry Wives of Windsor
date =
pages =
extension = .txt
mime = text/plain
words = 23987
sentences = 3557
flesch = 100
summary = and desire a marriage between Master Abraham and Mistress Anne Page. Mistress Anne Page for my master, in the way of marriage. Let it suffice thee, Mistress Page, at the least, if the love There is one Mistress Ford, sir,--I pray, come a little nearer this Speak, good Master Brook; I shall be glad to be your servant. Want no Mistress Ford, Master Brook; you shall want none. I pray you now, good Master Slender's serving-man, and friend [Enter PAGE, SHALLOW, SLENDER, HOST, SIR HUGH EVANS, My Master Sir John is come in at your back-door, Mistress Ford, [Re-enter FORD, PAGE, CAIUS, and SIR HUGH EVANS.] [Re-enter FORD, PAGE, CAIUS, and SIR HUGH EVANS.] Marry, sir, I come to your worship from Mistress Ford.
Now, Master Brook, you come to know what hath passed between me Why, sir, they were nothing but about Mistress Anne Page: to know
cache = ./cache/1517.txt
txt = ./txt/1517.txt
Building ./etc/reader.txt
1517 24500 2237
1517 24500 2237
number of items: 5
sum of words: 24,067
average size in words: 8,022
average readability score: 92
nouns: sir; caius; man; woman; wife; exit; husband; house; pistol; time; master; way; basket; love; host; heart; word; scene; knight; name; letter; hand; humour; head; hath; boy; worship; men; fairies; money; knave; daughter; night; matter; father; buck; peace; mind; gentleman; page; none; mine; gentlemen; will; room; re; maid; doctor; hour; day
verbs: is; be; have; come; do; go; am; let; ''s; are; enter; say; see; know; had; tell; pray; make; was; were; give; think; speak; take; been; hear; follow; did; comes; love; look; has; warrant; find; does; thank; made; keep; hath; marry; forsooth; call; desire; send; hold; carry; leave; hang; says; meet
adjectives: good; more; old; honest; own; great; other; young; true; little; glad; better; such; sweet; poor; mine; fair; many; dead; fat; white; simple; much; mad; jealous; best; wise; very; sure; same; jest; hot; green; foul; long; ill; bold; welcome; quick; open; loose; like; french; drunk; wicked; well; short; reasonable; ready; new
adverbs: not; so; here; now; quickly; well; then; as; too; never; away; there; out; indeed; in; up; yet; very; again; aside; rather; down; ever; truly; on; no; first; together; tis; thus; more; once; much; long; home; forth; else; still; soon; even; only; off; most; enough; all; ill; better; before; also; straight
pronouns: i; you; my; he; me; your; it; her; him; his; she; we; they; them; thee; thy; their; our; us; myself; himself; yourself; mine; ''s; themselves; itself; herself; thyself; ourselves; on''t; ''em
proper nouns: page; ford; mrs; falstaff; master; mistress; evans; sir; slender; shallow; .; anne; john; host; thou; fenton; brook; simple; hugh; bardolph;
rugby; windsor; gar; doctor; exeunt; hath; william; nym; caius; quickly; robin; heaven; garter; enter; de; marry; jack; tis; pistol; herne; nan; dat; parson; inn; good; welsh; park; got; english; adieu
keywords: ebook; slender; shallow; quickly; page; mrs; mistress; master; ford; falstaff; evans; caiu
one topic; one dimension: page
file(s):
titles(s): Shakespeare's Lost Years in London, 1586-1592
three topics; one dimension: page; ebooks; ebooks
file(s): ./cache/1517.txt, ,
titles(s): The Merry Wives of Windsor | Shakespeare's Lost Years in London, 1586-1592 | Shakespeare's Lost Years in London, 1586-1592
five topics; three dimensions: page ford mrs; ebooks www methods; ebooks www methods; ebooks www methods; ebooks www methods
file(s): ./cache/1517.txt, , , ,
titles(s): The Merry Wives of Windsor | Shakespeare's Lost Years in London, 1586-1592 | Shakespeare's Lost Years in London, 1586-1592 | Shakespeare's Lost Years in London, 1586-1592 | Shakespeare's Lost Years in London, 1586-1592
Type: gutenberg
title: subject-falstaffJohnSirFictitiousCharacter-gutenberg
date: 2021-06-06
time: 15:06
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: facet_subject:"Falstaff, John, Sir (Fictitious character)"
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: 24500
author: Acheson, Arthur
title: Shakespeare's Lost Years in London, 1586-1592
date:
words: nan
sentences: nan
pages:
flesch: nan
cache:
txt:
summary:
id: 1116
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
words: 40.0
sentences: 10.0
pages:
flesch: 88.0
cache: ./cache/1116.txt
txt: ./txt/1116.txt
summary: THIS EBOOK WAS ONE OF PROJECT GUTENBERG''S EARLY FILES PRODUCED AT A TIME WHEN PROOFING METHODS AND TOOLS WERE NOT WELL DEVELOPED.
IS AN IMPROVED EDITION OF THIS TITLE WHICH MAY BE VIEWED AS EBOOK (#1517) at https://www.gutenberg.org/ebooks/1517
id: 1781
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
words: 40.0
sentences: 10.0
pages:
flesch: 88.0
cache: ./cache/1781.txt
txt: ./txt/1781.txt
summary: THIS EBOOK WAS ONE OF PROJECT GUTENBERG''S EARLY FILES PRODUCED AT A TIME WHEN PROOFING METHODS AND TOOLS WERE NOT WELL DEVELOPED. IS AN IMPROVED EDITION OF THIS TITLE WHICH MAY BE VIEWED AS EBOOK (#1517) at https://www.gutenberg.org/ebooks/1517
id: 1517
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
words: 23987.0
sentences: 3557.0
pages:
flesch: 100.0
cache: ./cache/1517.txt
txt: ./txt/1517.txt
summary: and desire a marriage between Master Abraham and Mistress Anne Page. Mistress Anne Page for my master, in the way of marriage. Let it suffice thee, Mistress Page, at the least, if the love There is one Mistress Ford, sir,--I pray, come a little nearer this Speak, good Master Brook; I shall be glad to be your servant. Want no Mistress Ford, Master Brook; you shall want none. I pray you now, good Master Slender''s serving-man, and friend [Enter PAGE, SHALLOW, SLENDER, HOST, SIR HUGH EVANS, My Master Sir John is come in at your back-door, Mistress Ford, [Re-enter FORD, PAGE, CAIUS, and SIR HUGH EVANS.] [Re-enter FORD, PAGE, CAIUS, and SIR HUGH EVANS.] Marry, sir, I come to your worship from Mistress Ford. Now, Master Brook, you come to know what hath passed between me Why, sir, they were nothing but about Mistress Anne Page: to know
id: 2237
author: Shakespeare, William
title: The Merry Wives of Windsor
date:
words: nan
sentences: nan
pages:
flesch: nan
cache:
txt:
summary:
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel