mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-clock-freebo
Initializing database
Unzipping
Archive: input-file.zip
  inflating: ./tmp/input/B06166.xml
  inflating: ./tmp/input/xml2htm.xsl
  inflating: ./tmp/input/A60474.xml
  inflating: ./tmp/input/metadata.csv
  inflating: ./tmp/input/A35722.xml
  inflating: ./tmp/input/A35726.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-clock-freebo
May 24, 2021 5:00:28 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 24, 2021 5:00:28 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 24, 2021 5:00:28 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @2844ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @2919ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@b4711e2{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/B06166.xml OUTPUT: txt/B06166.txt
FILE: cache/A60474.xml OUTPUT: txt/A60474.txt
FILE: cache/A35726.xml OUTPUT: txt/A35726.txt
FILE: cache/A35722.xml OUTPUT: txt/A35722.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'B06166.xml'
INFO Detecting media type for Filename: b'A60474.xml'
INFO Detecting media type for Filename: b'A35726.xml'
INFO Detecting media type for Filename: b'A35722.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
B06166 txt/../pos/B06166.pos
B06166 txt/../wrd/B06166.wrd
B06166 txt/../ent/B06166.ent
A60474 txt/../pos/A60474.pos
=== file2bib.sh ===
id: B06166
author: Tompion, Thomas, 1639-1713.
title: A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year.
date: 1683
pages:
extension: .xml
txt: ./txt/B06166.txt
cache: ./cache/B06166.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 6
resourceName b'B06166.xml'
A35726 txt/../pos/A35726.pos
A60474 txt/../wrd/A60474.wrd
A60474 txt/../ent/A60474.ent
A35726 txt/../wrd/A35726.wrd
=== file2bib.sh ===
id: A60474
author: Smith, John, fl. 1673-1680.
title: Of the unequality of natural time, with its reason and cavses. together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ...
date: 1686
pages:
extension: .xml
txt: ./txt/A60474.txt
cache: ./cache/A60474.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 46
resourceName b'A60474.xml'
=== file2bib.sh ===
id: A35726
author: Derham, W. (William), 1657-1735.
title: A supplement to the treatise of watch & clock-work called The artificial clock-maker ... by W.D., M.A.
date: 1700
pages:
extension: .xml
txt: ./txt/A35726.txt
cache: ./cache/A35726.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 66
resourceName b'A35726.xml'
A35726 txt/../ent/A35726.ent
A35722 txt/../pos/A35722.pos
A35722 txt/../wrd/A35722.wrd
A35722 txt/../ent/A35722.ent
=== file2bib.sh ===
id: A35722
author: Derham, W. (William), 1657-1735.
title: The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D.
date: 1696
pages:
extension: .xml
txt: ./txt/A35722.txt
cache: ./cache/A35722.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 91
resourceName b'A35722.xml'
Done mapping.
Reducing subject-clock-freebo
=== reduce.pl bib ===
id = A35722
author = Derham, W. (William), 1657-1735.
title = The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D.
date = 1696
pages =
extension = .xml
mime = application/xml
words = 28111
sentences = 9007
flesch = 100
summary = The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D. The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/A35722.xml
txt = ./txt/A35722.txt
=== reduce.pl bib ===
id = A35726
author = Derham, W. (William), 1657-1735.
title = A supplement to the treatise of watch & clock-work called The artificial clock-maker ... by W.D., M.A.
date = 1700
pages =
extension = .xml
mime = application/xml
words = 10120
sentences = 5103
flesch = 104
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Monsieur Romer's satellite-instrument : with observation concerning the calculation of the eclipses of Jupiter's satellites, and to find the longitude by them, 3. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A35726.xml
txt = ./txt/A35726.txt
=== reduce.pl bib ===
id = B06166
author = Tompion, Thomas, 1639-1713.
title = A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year.
date = 1683
pages =
extension = .xml
mime = application/xml
words = 1812
sentences = 855
flesch = 101
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year. A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/B06166.xml
txt = ./txt/B06166.txt
=== reduce.pl bib ===
id = A60474
author = Smith, John, fl. 1673-1680.
title = Of the unequality of natural time, with its reason and cavses. together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ...
date = 1686
pages =
extension = .xml
mime = application/xml
words = 7741
sentences = 2278
flesch = 97
summary = together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ... together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ... EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
cache = ./cache/A60474.xml
txt = ./txt/A60474.txt
Building ./etc/reader.txt
A35722
A35726
A60474
B06166
A60474
A35726
number of items: 4
sum of words: 47,784
average size in words: 11,946
average readability score: 100
nouns: wheel; number; time; numbers; ▪; work; watch; day; turns; hours; seconds; part; hour; way; turn; table; days; viz; times; year; motion; quotient; use; pins; length; parts; month; piece; beats; strokes; daies; text; minutes; inches; t; p.; reason; motions; clocks; rule; pinion; example; clock; reader; minute; texts; others; end; wheels; hath
verbs: is; be; are; have; was; find; set; being; make; do; go; were; made; been; add; according; had; say; see; shew; found; having; lose; gain; divide; divided; suppose; said; fixed; adjusted; let; following; take; multiplied; given; multiply; mentioned; going; give; encoded; done; called; published; think; turns; know; gives; come; ''s; note
adjectives: other; first; great; same; last; true; slow; many; little; more; natural; such; whole; second; mean; equal; small; next; old; usual; much; several; good; exact; early; longer; shorter; plain; necessary; general; right; fourth; english; convenient; common; third; less; particular; own; ingenious; long; lesser; greater; possible; lower; former; better; available; useful; large
adverbs: so; thus; not; too; then; more; therefore; only; before; out; also; very; fast; well; up; much; now; first; together; down; about; once; here; as; round; exactly; still; most; right; commonly; sometimes; somewhat; never; longer; in; far; accordingly; yet; faster; chiefly; almost; rather; perhaps; enough; again; forward; ever; long; thereof; partly
pronouns: it; you; i; your; their; his; its; he; them; they; my; we; him; themselves; our; me; us; her; himself; itself; thy; she; paralel; one; mine
proper nouns: clock; wheel; sun; pinion; c.; pendulum; watch; dial; 〉; ◊; mr; report; fusy; 〈; tcp; s; crown; m; minutes; l.; sect; sec; meridian; beats; wheels; jupiter; motion; chap; satellite; rule; mr.; ball; time; star; ●; margin; pinions; ▪; year; pendulums; clocks; ballance; piece; calculation; pole; p.; min; right; pend; hour
keywords: sun; clock; wheel; watch; tcp; satellite; rule; report; pinion; pendulum; minutes; meridian; great; fusy; crown
one topic; one dimension: wheel
file(s): ./cache/A35722.xml
titles(s): The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D.
three topics; one dimension: wheel; 14; shall
file(s): ./cache/A35722.xml, ./cache/A35726.xml, ./cache/A60474.xml
titles(s): The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D. | A supplement to the treatise of watch & clock-work called The artificial clock-maker ... by W.D., M.A. | Of the unequality of natural time, with its reason and cavses. together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ...
five topics; three dimensions: wheel number pinion; 14 13 15; shall time day; month clocks scotland; month clocks scotland
file(s): ./cache/A35722.xml, ./cache/A35726.xml, ./cache/A60474.xml, ./cache/B06166.xml, ./cache/B06166.xml
titles(s): The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D.
| A supplement to the treatise of watch & clock-work called The artificial clock-maker ... by W.D., M.A. | Of the unequality of natural time, with its reason and cavses. together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ... | A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year. | A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year.
Type: zip2carrel
title: subject-clock-freebo
date: 2021-05-24
time: 16:59
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A35722
author: Derham, W. (William), 1657-1735.
title: The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D.
date: 1696
words: 28111
sentences: 9007
pages:
flesch: 100
cache: ./cache/A35722.xml
txt: ./txt/A35722.txt
summary: The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D. The artificial clock-maker a treatise of watch, and clock-work, wherein the art of calculating numbers for most sorts of movements is explained to the capacity of the unlearned : also, the history of clock-work, both ancient and modern, with other useful matters, never before published / by W.D. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
id: A35726
author: Derham, W. (William), 1657-1735.
title: A supplement to the treatise of watch & clock-work called The artificial clock-maker ... by W.D., M.A.
date: 1700
words: 10120
sentences: 5103
pages:
flesch: 104
cache: ./cache/A35726.xml
txt: ./txt/A35726.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Monsieur Romer''s satellite-instrument : with observation concerning the calculation of the eclipses of Jupiter''s satellites, and to find the longitude by them, 3. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A60474
author: Smith, John, fl. 1673-1680.
title: Of the unequality of natural time, with its reason and cavses. together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ...
date: 1686
words: 7741
sentences: 2278
pages:
flesch: 97
cache: ./cache/A60474.xml
txt: ./txt/A60474.txt
summary: together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ... together with a table of the true æquation of natvral dayes : drawn up chiefly for the use of the gentry, in order to their more true adjusting, and right managing of pendulum clocks, and watches / by John Smith ... EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
id: B06166
author: Tompion, Thomas, 1639-1713.
title: A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year.
date: 1683
words: 1812
sentences: 855
pages:
flesch: 101
cache: ./cache/B06166.xml
txt: ./txt/B06166.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year. A table of the equation of days, shewing how much a good pendulum watch ought to be faster or slower than a true sun-dial, every day of the year. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel