mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named author-plutarch-freebo
Initializing database
Unzipping
Archive: input-file.zip
  inflating: ./tmp/input/xml2htm.xsl
  inflating: ./tmp/input/metadata.csv
  inflating: ./tmp/input/A55194.xml
  inflating: ./tmp/input/A55203.xml
  inflating: ./tmp/input/A55206.xml
  inflating: ./tmp/input/A55202.xml
  inflating: ./tmp/input/A55198.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named author-plutarch-freebo
May 24, 2021 9:37:11 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 24, 2021 9:37:11 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 24, 2021 9:37:11 AM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @3077ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @3178ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@51fadaff{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/A55198.xml OUTPUT: txt/A55198.txt
FILE: cache/A55202.xml OUTPUT: txt/A55202.txt
FILE: cache/A55206.xml OUTPUT: txt/A55206.txt
FILE: cache/A55203.xml OUTPUT: txt/A55203.txt
FILE: cache/A55194.xml OUTPUT: txt/A55194.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'A55198.xml'
INFO Detecting media type for Filename: b'A55206.xml'
INFO Detecting media type for Filename: b'A55202.xml'
INFO Detecting media type for Filename: b'A55203.xml'
INFO Detecting media type for Filename: b'A55194.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A55202 txt/../pos/A55202.pos
A55198 txt/../pos/A55198.pos
A55206 txt/../pos/A55206.pos
A55194 txt/../pos/A55194.pos
A55203 txt/../pos/A55203.pos
A55202 txt/../ent/A55202.ent
A55198 txt/../ent/A55198.ent
A55206 txt/../ent/A55206.ent
A55194 txt/../ent/A55194.ent
A55203 txt/../ent/A55203.ent
A55202 txt/../wrd/A55202.wrd
A55198 txt/../wrd/A55198.wrd
A55206 txt/../wrd/A55206.wrd
A55194 txt/../wrd/A55194.wrd
A55203 txt/../wrd/A55203.wrd
=== file2bib.sh ===
id: A55202
author: Plutarch.
title: The third volume of Plutarch's lives. Translated from the Greek, by several hands
date: 1693
pages:
extension: .xml
txt: ./txt/A55202.txt
cache: ./cache/A55202.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 299
resourceName b'A55202.xml'
=== file2bib.sh ===
id: A55198
author: Plutarch.
title: The second volume of Plutarch's Lives Translated from the Greek, by several hands.
date: 1688
pages:
extension: .xml
txt: ./txt/A55198.txt
cache: ./cache/A55198.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 402
resourceName b'A55198.xml'
=== file2bib.sh ===
id: A55206
author: Plutarch.
title: The fifth and last volume of Plutarchs Lives Translated from the Greek by several hands.
date: 1693
pages:
extension: .xml
txt: ./txt/A55206.txt
cache: ./cache/A55206.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 337
resourceName b'A55206.xml'
=== file2bib.sh ===
id: A55203
author: Plutarch.
title: The fourth volume of Plutarch's Lives Translated from the Greek, by several hands.
date: 1693
pages:
extension: .xml
txt: ./txt/A55203.txt
cache: ./cache/A55203.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 356
resourceName b'A55203.xml'
=== file2bib.sh ===
id: A55194
author: Plutarch.
title: Plutarch's Lives. Their first volume translated from the Greek by several hands ; to which is prefixt The life of Plutarch.
date: 1683
pages:
extension: .xml
txt: ./txt/A55194.txt
cache: ./cache/A55194.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 473
resourceName b'A55194.xml'
Done mapping.
Reducing author-plutarch-freebo
=== reduce.pl bib ===
id = A55202
author = Plutarch.
title = The third volume of Plutarch's lives. Translated from the Greek, by several hands
date = 1693
pages =
extension = .xml
mime = application/xml
words = 139700
sentences = 39904
flesch = 91
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed by R.E. for Jacob Tonson, at the Judges-Head in Chancery-Lane, near Fleet-street, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A55202.xml
txt = ./txt/A55202.txt
=== reduce.pl bib ===
id = A55194
author = Plutarch.
title = Plutarch's Lives. Their first volume translated from the Greek by several hands ; to which is prefixt The life of Plutarch.
date = 1683
pages =
extension = .xml
mime = application/xml
words = 170278
sentences = 47980
flesch = 91
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A55194.xml
txt = ./txt/A55194.txt
=== reduce.pl bib ===
id = A55198
author = Plutarch.
title = The second volume of Plutarch's Lives Translated from the Greek, by several hands.
date = 1688
pages =
extension = .xml
mime = application/xml
words = 144328
sentences = 41989
flesch = 91
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson, at the Judges-Head in Chancery-Lane, near Fleet-Street, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A55198.xml
txt = ./txt/A55198.txt
=== reduce.pl bib ===
id = A55206
author = Plutarch.
title = The fifth and last volume of Plutarchs Lives Translated from the Greek by several hands.
date = 1693
pages =
extension = .xml
mime = application/xml
words = 168248
sentences = 48499
flesch = 90
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson at the Judge's-Head in Chancery-lane, near Fleet-street, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A55206.xml
txt = ./txt/A55206.txt
=== reduce.pl bib ===
id = A55203
author = Plutarch.
title = The fourth volume of Plutarch's Lives Translated from the Greek, by several hands.
date = 1693
pages =
extension = .xml
mime = application/xml
words = 186558
sentences = 55383
flesch = 92
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson at the Judges Head in Chancery-lane, near FleetStreet, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A55203.xml
txt = ./txt/A55203.txt
Building ./etc/reader.txt
A55203 A55194 A55206 A55206 A55203 A55202
number of items: 5
sum of words: 809,112
average size in words: 161,822
average readability score: 91
nouns: time; man; people; men; others; things; day; way; place; part; rest; thing; body; death; t; reason; enemies; nothing; manner; hands; side; friends; country; one; arms; self; life; order; person; war; name; power; occasion; money; hand; victory; number; use; days; mind; design; account; end; danger; horse; tho; fear; years; affairs; opinion
verbs: was; had; were; be; being; is; made; have; did; having; came; said; been; are; sent; took; make; gave; put; brought; went; taken; do; set; say; thought; come; take; left; fell; give; done; found; began; see; commanded; told; taking; called; lay; making; got; given; saw; go; coming; stood; says; kept; carried
adjectives: great; other; many; own; such; good; same; much; first; little; more; whole; young; several; greatest; common; most; last; old; publick; few; present; certain; long; very; small; ill; full; best; able; ready; greater; least; private; new; better; former; dead; less; free; considerable; true; next; mean; second; roman; short; poor; strong; ancient
adverbs: not; so; up; very; now; then; out; as; only; most; more; also; much; therefore; well; thus; off; there; down; yet; together; first; away; soon; again; too; never; in; rather; even; still; over; indeed; ever; afterwards; far; immediately; presently; on; always; all; long; before; no; just; back; onely; here; often; however
pronouns: he; his; him; they; their; it; them; himself; her; i; themselves; you; she; we; my; your; us; our; me; its; ''em; thy; one; thee; theirs; us''d; em; yours; mine; shou''d; ours; hers; theseus; herself; ye; s; myself; †; yourself; whosoever; itself; encompass''d; ''s; urg''d; unfurnish''d; transgress''d; offer''d; obs; o; judg''d
proper nouns: 〉; ◊; 〈; city; army; caesar; king; war; pompey; romans; men; enemy; rome; cato; son; sea; senate; athenians; battel; alexander; general; camp; brutus; government; antony; citizens; people; friends; father; sylla; cicero; greece; athens; man; fortune; demetrius; marius; house; law; power; honour; affairs; horse; gods; antigonus; command; agesilaus; italy; lucullus; wife
keywords: war; son; senate; sea; romans; people; men; man; life; king; government; friends; enemy; enemies; city; camp; body; battel; athenians; army; power; horse; general; country; citizens; affairs; wife; victory; soldiers; place; father; cities; river; laws; lacedaemonians; greeks; gods; forces; children; cato; caesar; action; women; titus; timoleon; theseus; themistocles; temple; sylla; state
one topic; one dimension: great
file(s): ./cache/A55194.xml
titles(s): Plutarch''s Lives. Their first volume translated from the Greek by several hands ; to which is prefixt The life of Plutarch.
three topics; one dimension: great; great; salutes
file(s): ./cache/A55194.xml, ./cache/A55206.xml, ./cache/A55202.xml
titles(s): Plutarch''s Lives. Their first volume translated from the Greek by several hands ; to which is prefixt The life of Plutarch. | The fifth and last volume of Plutarchs Lives Translated from the Greek by several hands. | The third volume of Plutarch''s lives. Translated from the Greek, by several hands
five topics; three dimensions: great time men; great pompey caesar; did great people; tiberius alway sardinia; tiberius alway sardinia
file(s): ./cache/A55194.xml, ./cache/A55203.xml, ./cache/A55198.xml, ./cache/A55202.xml, ./cache/A55202.xml
titles(s): Plutarch''s Lives. Their first volume translated from the Greek by several hands ; to which is prefixt The life of Plutarch. | The fourth volume of Plutarch''s Lives Translated from the Greek, by several hands. | The second volume of Plutarch''s Lives Translated from the Greek, by several hands. | The third volume of Plutarch''s lives. Translated from the Greek, by several hands | The third volume of Plutarch''s lives. Translated from the Greek, by several hands
Type: zip2carrel
title: author-plutarch-freebo
date: 2021-05-24
time: 09:13
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A55194
author: Plutarch.
title: Plutarch''s Lives. Their first volume translated from the Greek by several hands ; to which is prefixt The life of Plutarch.
date: 1683
words: 170278
sentences: 47980
pages:
flesch: 91
cache: ./cache/A55194.xml
txt: ./txt/A55194.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A55198
author: Plutarch.
title: The second volume of Plutarch''s Lives Translated from the Greek, by several hands.
date: 1688
words: 144328
sentences: 41989
pages:
flesch: 91
cache: ./cache/A55198.xml
txt: ./txt/A55198.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson, at the Judges-Head in Chancery-Lane, near Fleet-Street, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A55202
author: Plutarch.
title: The third volume of Plutarch''s lives. Translated from the Greek, by several hands
date: 1693
words: 139700
sentences: 39904
pages:
flesch: 91
cache: ./cache/A55202.xml
txt: ./txt/A55202.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed by R.E. for Jacob Tonson, at the Judges-Head in Chancery-Lane, near Fleet-street, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A55203
author: Plutarch.
title: The fourth volume of Plutarch''s Lives Translated from the Greek, by several hands.
date: 1693
words: 186558
sentences: 55383
pages:
flesch: 92
cache: ./cache/A55203.xml
txt: ./txt/A55203.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson at the Judges Head in Chancery-lane, near FleetStreet, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A55206
author: Plutarch.
title: The fifth and last volume of Plutarchs Lives Translated from the Greek by several hands.
date: 1693
words: 168248
sentences: 48499
pages:
flesch: 90
cache: ./cache/A55206.xml
txt: ./txt/A55206.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. printed for Jacob Tonson at the Judge''s-Head in Chancery-lane, near Fleet-street, EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel