mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-unitarian-freebo
Initializing database
Unzipping
Archive: input-file.zip
inflating: ./tmp/input/A26746.xml
inflating: ./tmp/input/xml2htm.xsl
inflating: ./tmp/input/metadata.csv
inflating: ./tmp/input/A40444.xml
inflating: ./tmp/input/A52606.xml
inflating: ./tmp/input/A23823.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-unitarian-freebo
May 25, 2021 12:43:39 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 25, 2021 12:43:39 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 25, 2021 12:43:39 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server
INFO Setting the server's publish address to be http://localhost:9998/
INFO Logging initialized @3760ms to org.eclipse.jetty.util.log.Slf4jLog
INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09
INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998}
INFO Started @3896ms
WARN Empty contextPath
INFO Started o.e.j.s.h.ContextHandler@62010f5c{/,null,AVAILABLE}
INFO Started Apache Tika server at http://localhost:9998/
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
FILE: cache/A40444.xml OUTPUT: txt/A40444.txt
FILE: cache/A52606.xml OUTPUT: txt/A52606.txt
FILE: cache/A23823.xml OUTPUT: txt/A23823.txt
FILE: cache/A26746.xml OUTPUT: txt/A26746.txt
=== file2bib.sh ===
INFO Detecting media type for Filename: b'A40444.xml'
INFO Detecting media type for Filename: b'A52606.xml'
INFO Detecting media type for Filename: b'A26746.xml'
INFO Detecting media type for Filename: b'A23823.xml'
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
INFO rmeta/text (autodetecting type)
A40444 txt/../pos/A40444.pos
A26746 txt/../pos/A26746.pos
A52606 txt/../pos/A52606.pos
A40444 txt/../wrd/A40444.wrd
A23823 txt/../pos/A23823.pos
A40444 txt/../ent/A40444.ent
A52606 txt/../wrd/A52606.wrd
=== file2bib.sh ===
id: A40444
author: Freke, William, 1662-1744.
title: A vindication of the Unitarians, against a late reverend author on the Trinity
date: 1687
pages:
extension: .xml
txt: ./txt/A40444.txt
cache: ./cache/A40444.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 88
resourceName b'A40444.xml'
A26746 txt/../wrd/A26746.wrd
A23823 txt/../wrd/A23823.wrd
A23823 txt/../ent/A23823.ent
A26746 txt/../ent/A26746.ent
=== file2bib.sh ===
id: A52606
author: Biddle, John, 1615-1662.
title: A brief history of the Unitarians, called also Socinians in four letters, written to a friend.
date: 1687
pages:
extension: .xml
txt: ./txt/A52606.txt
cache: ./cache/A52606.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 97
resourceName b'A52606.xml'
=== file2bib.sh ===
id: A23823
author: Allix, Pierre, 1641-1717.
title: A Defence of the Brief history of the Unitarians, against Dr. Sherlock's answer in his Vindication of the Holy Trinity
date: 1691
pages:
extension: .xml
txt: ./txt/A23823.txt
cache: ./cache/A23823.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 100
resourceName b'A23823.xml'
A52606 txt/../ent/A52606.ent
=== file2bib.sh ===
id: A26746
author: Basset, William, 1644-1695.
title: An answer to the Brief history of the Unitarians, called also Socinians by William Basset ...
date: 1693
pages:
extension: .xml
txt: ./txt/A26746.txt
cache: ./cache/A26746.xml
Content-Type application/xml
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser']
X-TIKA:content_handler ToTextContentHandler
X-TIKA:embedded_depth 0
X-TIKA:parse_time_millis 96
resourceName b'A26746.xml'
Done mapping.
Reducing subject-unitarian-freebo
=== reduce.pl bib ===
id = A26746
author = Basset, William, 1644-1695.
title = An answer to the Brief history of the Unitarians, called also Socinians by William Basset ...
date = 1693
pages =
extension = .xml
mime = application/xml
words = 33590
sentences = 11165
flesch = 96
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. An answer to the Brief history of the Unitarians, called also Socinians by William Basset ... EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A26746.xml
txt = ./txt/A26746.txt
=== reduce.pl bib ===
id = A52606
author = Biddle, John, 1615-1662.
title = A brief history of the Unitarians, called also Socinians in four letters, written to a friend.
date = 1687
pages =
extension = .xml
mime = application/xml
words = 31065
sentences = 10505
flesch = 95
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A52606.xml
txt = ./txt/A52606.txt
=== reduce.pl bib ===
id = A40444
author = Freke, William, 1662-1744.
title = A vindication of the Unitarians, against a late reverend author on the Trinity
date = 1687
pages =
extension = .xml
mime = application/xml
words = 18578
sentences = 6333
flesch = 97
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A40444.xml
txt = ./txt/A40444.txt
=== reduce.pl bib ===
id = A23823
author = Allix, Pierre, 1641-1717.
title = A Defence of the Brief history of the Unitarians, against Dr. Sherlock's answer in his Vindication of the Holy Trinity
date = 1691
pages =
extension = .xml
mime = application/xml
words = 38756
sentences = 12177
flesch = 95
summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A Defence of the Brief history of the Unitarians, against Dr. Sherlock's answer in his Vindication of the Holy Trinity EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).
Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
cache = ./cache/A23823.xml
txt = ./txt/A23823.txt
Building ./etc/reader.txt
A23823 A52606 A26746 A23823 A52606 A40444
number of items: 4
sum of words: 121,989
average size in words: 30,497
average readability score: 95
nouns: things; words; author; person; answ; scripture; man; name; t; reason; sense; thing; place; argument; men; nothing; scriptures; beginning; time; doth; self; nature; texts; others; truth; way; p.; glory; fathers; viz; power; none; ▪; mystery; word; hypothesis; creature; one; whence; body; contradiction; hath; page; heb; apostle; text; part; faith; gifts; answer
verbs: is; be; was; are; have; were; say; do; had; made; said; has; called; did; being; been; prove; see; make; does; know; says; let; given; think; born; believe; am; according; created; give; spoken; speak; baptized; come; sent; done; deny; speaks; tell; saith; concerning; grant; shew; knows; take; find; thought; put; makes
adjectives: other; true; same; own; such; first; great; more; good; plain; whole; many; saith; false; greater; distinct; only; impossible; second; proper; much; last; invisible; absurd; common; very; next; equal; contrary; necessary; socinian; little; least; clear; several; able; natural; infinite; subject; eternal; general; evident; high; express; doth; particular; humane; early; right; -
adverbs: not; so; only; therefore; then; thus; as; also; here; now; indeed; more; most; well; even; is; yet; first; never; ever; very; that; too; up; much; plainly; always; consequently; again; all; far; before; together; there; else; rather; out; ver; often; no; forth; already; down; in; sometimes; on; hence; likewise; expresly; certainly
pronouns: he; it; his; i; him; they; you; we; our; them; us; himself; their; my; me; your; themselves; thy; its; thee; one; her; ''em; ye; she; yours; myself; theirs; mine; ours; itself; em; us''d; yourself; saythat;
oppos''d; honours
proper nouns: god; christ; father; son; ◊; 〉; 〈; holy; ghost; lord; spirit; john; man; word; world; power; doctrine; text; men; c.; jesus; st.; church; divine; persons; trinity; nature; socinians; angels; sir; i.; e.; heaven; cor; earth; pag; thou; jehovah; creed; socinian; faith; divinity; scripture; person; gospel; texts; creature; kingdom; hath; paul
keywords: son; holy; god; father; ghost; christ; spirit; power; doctrine; word; text; socinians; socinian; sir; scripture; person; nature; lord; hypothesis; divinity; cor; author
one topic; one dimension: god
file(s): ./cache/A26746.xml
titles(s): An answer to the Brief history of the Unitarians, called also Socinians by William Basset ...
three topics; one dimension: god; god; block
file(s): ./cache/A23823.xml, ./cache/A40444.xml, ./cache/A40444.xml
titles(s): A Defence of the Brief history of the Unitarians, against Dr. Sherlock''s answer in his Vindication of the Holy Trinity | A vindication of the Unitarians, against a late reverend author on the Trinity | A vindication of the Unitarians, against a late reverend author on the Trinity
five topics; three dimensions: god christ father; god sir son; humanity exact block; humanity exact block; humanity exact block
file(s): ./cache/A23823.xml, ./cache/A40444.xml, ./cache/A40444.xml, ./cache/A40444.xml, ./cache/A40444.xml
titles(s): A Defence of the Brief history of the Unitarians, against Dr.
Sherlock''s answer in his Vindication of the Holy Trinity | A vindication of the Unitarians, against a late reverend author on the Trinity | A vindication of the Unitarians, against a late reverend author on the Trinity | A vindication of the Unitarians, against a late reverend author on the Trinity | A vindication of the Unitarians, against a late reverend author on the Trinity
Type: zip2carrel
title: subject-unitarian-freebo
date: 2021-05-25
time: 12:33
username: emorgan
patron: Eric Morgan
email: emorgan@nd.edu
input: input-file.zip
==== make-pages.sh htm files
==== make-pages.sh complex files
==== make-pages.sh named enities
==== making bibliographics
id: A23823
author: Allix, Pierre, 1641-1717.
title: A Defence of the Brief history of the Unitarians, against Dr. Sherlock''s answer in his Vindication of the Holy Trinity
date: 1691
words: 38756
sentences: 12177
pages:
flesch: 95
cache: ./cache/A23823.xml
txt: ./txt/A23823.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A Defence of the Brief history of the Unitarians, against Dr. Sherlock''s answer in his Vindication of the Holy Trinity EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A26746
author: Basset, William, 1644-1695.
title: An answer to the Brief history of the Unitarians, called also Socinians by William Basset ...
date: 1693
words: 33590
sentences: 11165
pages:
flesch: 96
cache: ./cache/A26746.xml
txt: ./txt/A26746.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. An answer to the Brief history of the Unitarians, called also Socinians by William Basset ... EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A52606
author: Biddle, John, 1615-1662.
title: A brief history of the Unitarians, called also Socinians in four letters, written to a friend.
date: 1687
words: 31065
sentences: 10505
pages:
flesch: 95
cache: ./cache/A52606.xml
txt: ./txt/A52606.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com).
The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
id: A40444
author: Freke, William, 1662-1744.
title: A vindication of the Unitarians, against a late reverend author on the Trinity
date: 1687
words: 18578
sentences: 6333
pages:
flesch: 97
cache: ./cache/A40444.xml
txt: ./txt/A40444.txt
summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO. EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period.
==== make-pages.sh questions
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel