mv: 'input-file.zip' and './input-file.zip' are the same file
Creating study carrel named subject-smoking-freebo
Initializing database
Unzipping
Archive: input-file.zip
  inflating: ./tmp/input/A01425.xml
  inflating: ./tmp/input/A38822.xml
  inflating: ./tmp/input/A14326.xml
  inflating: ./tmp/input/A20030.xml
  inflating: ./tmp/input/A19997.xml
  inflating: ./tmp/input/xml2htm.xsl
  inflating: ./tmp/input/A16679.xml
  inflating: ./tmp/input/metadata.csv
  inflating: ./tmp/input/A04242.xml
  inflating: ./tmp/input/A87472.xml
caution: excluded filename not matched: *MACOSX*
=== DIRECTORIES: ./tmp/input
=== DIRECTORY:
=== metadata file: ./tmp/input/metadata.csv
=== found metadata file
=== updating bibliographic database
Building study carrel named subject-smoking-freebo
May 25, 2021 12:11:42 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: J2KImageReader not loaded. JPEG2000 files will not be processed. See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io for optional dependencies.
May 25, 2021 12:11:42 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: Tesseract OCR is installed and will be automatically applied to image files unless you've excluded the TesseractOCRParser from the default parser. Tesseract may dramatically slow down content extraction (TIKA-2359). As of Tika 1.15 (and prior versions), Tesseract is automatically called. In future versions of Tika, users may need to turn the TesseractOCRParser on via TikaConfig.
May 25, 2021 12:11:42 PM org.apache.tika.config.InitializableProblemHandler$3 handleInitializableProblem
WARNING: org.xerial's sqlite-jdbc is not loaded. Please provide the jar on your classpath to parse sqlite files. See tika-parsers/pom.xml for the correct version.
INFO Starting Apache Tika 1.24.1 server INFO Setting the server's publish address to be http://localhost:9998/ INFO Logging initialized @3165ms to org.eclipse.jetty.util.log.Slf4jLog INFO jetty-9.4.27.v20200227; built: 2020-02-27T18:37:21.340Z; git: a304fd9f351f337e7c0e2a7c28878dd536149c6c; jvm 1.8.0_281-b09 INFO Started ServerConnector@3e74829{HTTP/1.1, (http/1.1)}{localhost:9998} INFO Started @3265ms WARN Empty contextPath INFO Started o.e.j.s.h.ContextHandler@62010f5c{/,null,AVAILABLE} INFO Started Apache Tika server at http://localhost:9998/ INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) FILE: cache/A04242.xml OUTPUT: txt/A04242.txt FILE: cache/A01425.xml OUTPUT: txt/A01425.txt FILE: cache/A14326.xml OUTPUT: txt/A14326.txt FILE: cache/A20030.xml OUTPUT: txt/A20030.txt FILE: cache/A38822.xml OUTPUT: txt/A38822.txt FILE: cache/A16679.xml OUTPUT: txt/A16679.txt FILE: cache/A87472.xml OUTPUT: txt/A87472.txt FILE: cache/A19997.xml OUTPUT: txt/A19997.txt === file2bib.sh === INFO Detecting media type for Filename: b'A04242.xml' INFO Detecting media type for Filename: b'A14326.xml' INFO Detecting media type for Filename: b'A01425.xml' INFO Detecting media type for Filename: b'A20030.xml' INFO Detecting media type for Filename: b'A16679.xml' INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO Detecting media type for Filename: b'A87472.xml' INFO Detecting media type for Filename: b'A38822.xml' INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO Detecting media type for Filename: b'A19997.xml' INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) INFO rmeta/text (autodetecting type) A04242 txt/../pos/A04242.pos A14326 
txt/../pos/A14326.pos A04242 txt/../ent/A04242.ent A14326 txt/../ent/A14326.ent A04242 txt/../wrd/A04242.wrd A14326 txt/../wrd/A14326.wrd A20030 txt/../pos/A20030.pos A16679 txt/../pos/A16679.pos A38822 txt/../pos/A38822.pos A87472 txt/../pos/A87472.pos === file2bib.sh === id: A04242 author: James I, King of England, 1566-1625. title: A counterblaste to tobacco date: 1604 pages: extension: .xml txt: ./txt/A04242.txt cache: ./cache/A04242.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 11 resourceName b'A04242.xml' A20030 txt/../ent/A20030.ent A16679 txt/../ent/A16679.ent === file2bib.sh === id: A14326 author: Venner, Tobias, 1577-1660. title: A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire. date: 1621 pages: extension: .xml txt: ./txt/A14326.txt cache: ./cache/A14326.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 11 resourceName b'A14326.xml' A16679 txt/../wrd/A16679.wrd A87472 txt/../ent/A87472.ent A20030 txt/../wrd/A20030.wrd A38822 txt/../ent/A38822.ent A38822 txt/../wrd/A38822.wrd A87472 txt/../wrd/A87472.wrd A19997 txt/../pos/A19997.pos A01425 txt/../ent/A01425.ent === file2bib.sh === id: A20030 author: Marbecke, Roger, 1536-1605. 
title: A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. date: 1602 pages: extension: .xml txt: ./txt/A20030.txt cache: ./cache/A20030.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 138 resourceName b'A20030.xml' A01425 txt/../pos/A01425.pos A01425 txt/../wrd/A01425.wrd === file2bib.sh === id: A16679 author: Brathwaite, Richard, 1588?-1673. aut title: A solemne ioviall disputation, theoreticke and practicke; briefely shadowing the lavv of drinking together, with the solemnities and controversies occurring: fully and freely discussed according to the civill lavv. Which, by the permission, priviledge and authority, of that most noble and famous order in the Vniversity of Goddesse Potina; Dionisius Bacchus being then president, chiefe gossipper, and most excellent governour, Blasius Multibibus, aliàs Drinkmuch ... hath publikely expounded to his most approved and improved fellow-pot-shots; touching the houres before noone and after, usuall and lawfull. ... Faithfully rendred according to the originall Latine copie. date: 1617 pages: extension: .xml txt: ./txt/A16679.txt cache: ./cache/A16679.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 132 resourceName b'A16679.xml' === file2bib.sh === id: A38822 author: Everard, Giles. title: Panacea, or, The universal medicine being a discovery of the wonderfull vertues of tobacco taken in a pipe : with its operation and use both in physick and chyrurgery / by Dr Everard, &c. 
date: 1659 pages: extension: .xml txt: ./txt/A38822.txt cache: ./cache/A38822.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 139 resourceName b'A38822.xml' === file2bib.sh === id: A01425 author: Gallobelgicus. title: Wine, Beer, and Ale Together by the Ears date: 1630 pages: extension: .xml txt: ./txt/A01425.txt cache: ./cache/A01425.xml Content-Encoding ISO-8859-1 Content-Type text/plain; charset=ISO-8859-1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.csv.TextAndCSVParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 11 resourceName b'A01425.xml' A19997 txt/../wrd/A19997.wrd A19997 txt/../ent/A19997.ent === file2bib.sh === id: A87472 author: Everard, Giles. De herba panacea. English. Selections. 1676. title: The touchstone, or, Trial of tobacco whether it be good for all constitutions : with a word of advice against immoderate drinking and smoaking : likewise examples of some that have drunk their lives away, and died suddenly : with King Jame's [sic] opinion of tobacco, and how it came first into England : also the first original of coffee : to which is added, witty poems about tobacco and coffe [sic] : something about tobacco, written by George Withers, the late famous poet ... date: 1676 pages: extension: .xml txt: ./txt/A87472.txt cache: ./cache/A87472.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 142 resourceName b'A87472.xml' === file2bib.sh === id: A19997 author: Deacon, John, 17th cent. 
title: Tobacco tortured, or, The filthie fume of tobacco refined shewing all sorts of subiects, that the inward taking of tobacco fumes, is very pernicious vnto their bodies; too too profluuious for many of their purses; and most pestiferous to the publike state. Exemplified apparently by most fearefull effects: more especially, from their treacherous proiects about the Gun-powder Treason; from their rebellious attempts of late, about their preposterous disparking of certaine inclosures: as also, from sundry other their prodigious practices. ... date: 1616 pages: extension: .xml txt: ./txt/A19997.txt cache: ./cache/A19997.xml Content-Type application/xml X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.xml.DcXMLParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 191 resourceName b'A19997.xml' Done mapping. Reducing subject-smoking-freebo === reduce.pl bib === id = A38822 author = Everard, Giles. title = Panacea, or, The universal medicine being a discovery of the wonderfull vertues of tobacco taken in a pipe : with its operation and use both in physick and chyrurgery / by Dr Everard, &c. date = 1659 pages = extension = .xml mime = application/xml words = 28417 sentences = 8180 flesch = 97 summary = Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. Panacea, or, The universal medicine being a discovery of the wonderfull vertues of tobacco taken in a pipe : with its operation and use both in physick and chyrurgery / by Dr Everard, &c. Panacea, or, The universal medicine being a discovery of the wonderfull vertues of tobacco taken in a pipe : with its operation and use both in physick and chyrurgery / by Dr Everard, &c. 
civilwar no Panacea; or The universal medicine, being a discovery of the wonderfull vertues of tobacco taken in a pipe, with its operation and use both Everard, Giles 1659 31702 201 30 0 0 0 0 73 D The rate of 73 defects per 10,000 words puts this text in the D category of texts with between 35 and 100 defects per 10,000 words. cache = ./cache/A38822.xml txt = ./txt/A38822.txt === reduce.pl bib === id = A19997 author = Deacon, John, 17th cent. title = Tobacco tortured, or, The filthie fume of tobacco refined shewing all sorts of subiects, that the inward taking of tobacco fumes, is very pernicious vnto their bodies; too too profluuious for many of their purses; and most pestiferous to the publike state. Exemplified apparently by most fearefull effects: more especially, from their treacherous proiects about the Gun-powder Treason; from their rebellious attempts of late, about their preposterous disparking of certaine inclosures: as also, from sundry other their prodigious practices. ... date = 1616 pages = extension = .xml mime = application/xml words = 75741 sentences = 24670 flesch = 89 summary = Tobacco tortured, or, The filthie fume of tobacco refined shewing all sorts of subiects, that the inward taking of tobacco fumes, is very pernicious vnto their bodies; too too profluuious for many of their purses; and most pestiferous to the publike state. Tobacco tortured, or, The filthie fume of tobacco refined shewing all sorts of subiects, that the inward taking of tobacco fumes, is very pernicious vnto their bodies; too too profluuious for many of their purses; and most pestiferous to the publike state. Exemplified apparently by most fearefull effects: more especially, from their treacherous proiects about the Gun-powder Treason; from their rebellious attempts of late, about their preposterous disparking of certaine inclosures: as also, from sundry other their prodigious practices. 
Exemplified apparently by most fearefull effects: more especially, from their treacherous proiects about the Gun-powder Treason; from their rebellious attempts of late, about their preposterous disparking of certaine inclosures: as also, from sundry other their prodigious practices. cache = ./cache/A19997.xml txt = ./txt/A19997.txt === reduce.pl bib === id = A16679 author = Brathwaite, Richard, 1588?-1673. aut title = A solemne ioviall disputation, theoreticke and practicke; briefely shadowing the lavv of drinking together, with the solemnities and controversies occurring: fully and freely discussed according to the civill lavv. Which, by the permission, priviledge and authority, of that most noble and famous order in the Vniversity of Goddesse Potina; Dionisius Bacchus being then president, chiefe gossipper, and most excellent governour, Blasius Multibibus, aliàs Drinkmuch ... hath publikely expounded to his most approved and improved fellow-pot-shots; touching the houres before noone and after, usuall and lawfull. ... Faithfully rendred according to the originall Latine copie. date = 1617 pages = extension = .xml mime = application/xml words = 19587 sentences = 6047 flesch = 92 summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. Which, by the permission, priviledge and authority, of that most noble and famous order in the Vniversity of Goddesse Potina; Dionisius Bacchus being then president, chiefe gossipper, and most excellent governour, Blasius Multibibus, aliàs Drinkmuch ... The plates, signed by William Marshall, bear the titles "The lawes of drinking." and "The smoaking age or the life and death of tobacco.". 
EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period. cache = ./cache/A16679.xml txt = ./txt/A16679.txt === reduce.pl bib === id = A14326 author = Venner, Tobias, 1577-1660. title = A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire. date = 1621 pages = extension = .xml mime = application/xml words = 6545 sentences = 1639 flesch = 86 summary = A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire. 
By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire. cache = ./cache/A14326.xml txt = ./txt/A14326.txt === reduce.pl bib === id = A01425 author = Gallobelgicus. title = Wine, Beer, and Ale Together by the Ears date = 1630 pages = extension = .xml mime = text/plain words = 77407 sentences = 21080 flesch = 99 summary =

This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. This text is an enriched version of the TCP digital transcription A01425 of text S102807 in the English Short Title Catalog (STC 11542). Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. By T[homas] C[otes] for Iohn Groue, and are to be sold at his shop at Furniuals Inne Gate in Holborne,

A revision of "Wine, beere, and ale, together by the eares", which was attributed to Gallobelgicus.

cache = ./cache/A01425.xml txt = ./txt/A01425.txt === reduce.pl bib === id = A04242 author = James I, King of England, 1566-1625. title = A counterblaste to tobacco date = 1604 pages = extension = .xml mime = application/xml words = 6574 sentences = 1835 flesch = 91 summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Users should be aware of the process of creating the TCP texts, and therefore of any assumptions that can be made about the data. Understanding these processes should make clear that, while the overall quality of TCP data is very good, some errors will remain and some readable characters will be marked as illegible. cache = ./cache/A04242.xml txt = ./txt/A04242.txt === reduce.pl bib === id = A20030 author = Marbecke, Roger, 1536-1605. title = A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. date = 1602 pages = extension = .xml mime = application/xml words = 21825 sentences = 6505 flesch = 97 summary = This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. 
A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). cache = ./cache/A20030.xml txt = ./txt/A20030.txt === reduce.pl bib === id = A87472 author = Everard, Giles. De herba panacea. English. Selections. 1676. title = The touchstone, or, Trial of tobacco whether it be good for all constitutions : with a word of advice against immoderate drinking and smoaking : likewise examples of some that have drunk their lives away, and died suddenly : with King Jame's [sic] opinion of tobacco, and how it came first into England : also the first original of coffee : to which is added, witty poems about tobacco and coffe [sic] : something about tobacco, written by George Withers, the late famous poet ... 
date = 1676 pages = extension = .xml mime = application/xml words = 29719 sentences = 8450 flesch = 92 summary = The touchstone, or, Trial of tobacco whether it be good for all constitutions : with a word of advice against immoderate drinking and smoaking : likewise examples of some that have drunk their lives away, and died suddenly : with King Jame's [sic] opinion of tobacco, and how it came first into England : also the first original of coffee : to which is added, witty poems about tobacco and coffe [sic] : something about tobacco, written by George Withers, the late famous poet ... The touchstone, or, Trial of tobacco whether it be good for all constitutions : with a word of advice against immoderate drinking and smoaking : likewise examples of some that have drunk their lives away, and died suddenly : with King Jame's [sic] opinion of tobacco, and how it came first into England : also the first original of coffee : to which is added, witty poems about tobacco and coffe [sic] : something about tobacco, written by George Withers, the late famous poet ... 
cache = ./cache/A87472.xml txt = ./txt/A87472.txt Building ./etc/reader.txt A01425 A19997 A87472 A01425 A87472 A38822 number of items: 8 sum of words: 265,815 average size in words: 33,226 average readability score: 92 nouns: xml; id="a01425; pc; p; pos="n1; man; men; time; pos="vvi; selfe; capn; smoke; body; pos="n2; nature; fume; cs; part; things; way; matter; reason; >; thing; bodies; good; leaves; themselues; people; bodie; doth; life; word; euery; others; use; persons; nothing; text; water; vse; parts; pos="po; rest; place; head; fumes; sort; heart; tillage verbs: is; be; are; do; was; were; being; make; have; had; id="a01425; say; made; take; said; let; haue; taken; did; put; see; come; lemma="wine; concerning; am; know; selfe; brought; pray; lemma="your; called; according; set; found; become; tell; lemma="i; hath; reg="beer">beere..hauebeereThis keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. This text is an enriched version of the TCP digital transcription A01425 of text S102807 in the English Short Title Catalog (STC 11542). Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. Textual changes and metadata enrichments aim at making the text more computationally tractable, easier to read, and suitable for network-based collaborative curation by amateur and professional end users from many walks of life. By T[homas] C[otes] for Iohn Groue, and are to be sold at his shop at Furniuals Inne Gate in Holborne,

A revision of "Wine, beere, and ale, together by the eares", which was attributed to Gallobelgicus.

id: A04242 author: James I, King of England, 1566-1625. title: A counterblaste to tobacco date: 1604 words: 6574 sentences: 1835 pages: flesch: 91 cache: ./cache/A04242.xml txt: ./txt/A04242.txt summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). Users should be aware of the process of creating the TCP texts, and therefore of any assumptions that can be made about the data. Understanding these processes should make clear that, while the overall quality of TCP data is very good, some errors will remain and some readable characters will be marked as illegible. id: A20030 author: Marbecke, Roger, 1536-1605. title: A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. date: 1602 words: 21825 sentences: 6505 pages: flesch: 97 cache: ./cache/A20030.xml txt: ./txt/A20030.txt summary: This keyboarded and encoded edition of the work described above is co-owned by the institutions providing financial support to the Early English Books Online Text Creation Partnership. A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. A defence of tabacco vvith a friendly answer to the late printed booke called Worke for chimny-sweepers, &c. 
EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org). id: A14326 author: Venner, Tobias, 1577-1660. title: A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire. date: 1621 words: 6545 sentences: 1639 pages: flesch: 86 cache: ./cache/A14326.xml txt: ./txt/A14326.txt summary: A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. A briefe and accurate treatise, concerning, the taking of the fume of tobacco vvhich very many, in these dayes, doe too too licentiously vse. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. In which, the immoderate, irregular, and vnseasonable vse thereof is reprehended, and the true nature and best manner of vsing it, perspicuously demonstrated. 
By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire. By Tobias Venner, Doctor of Physicke in Bath, in the spring and fall, and at other times, in the borough of North Petherton neare to the ancient hauen towne of Bridge-water in Somersetshire.
==== make-pages.sh questions
Traceback (most recent call last):
  File "/ocean/projects/cis210016p/shared/reader-compute/reader-classic/bin/tsv2htm-questions.py", line 23, in <module>
    df = pd.read_csv( tsv, sep='\t' )
  File "/ocean/projects/cis210016p/shared/anaconda/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/ocean/projects/cis210016p/shared/anaconda/lib/python3.8/site-packages/pandas/io/parsers.py", line 458, in _read
    data = parser.read(nrows)
  File "/ocean/projects/cis210016p/shared/anaconda/lib/python3.8/site-packages/pandas/io/parsers.py", line 1196, in read
    ret = self._engine.read(nrows)
  File "/ocean/projects/cis210016p/shared/anaconda/lib/python3.8/site-packages/pandas/io/parsers.py", line 2155, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 847, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 862, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 918, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 905, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2042, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: EOF inside string starting at row 121
==== make-pages.sh search
==== make-pages.sh topic modeling corpus
Zipping study carrel
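Note on the ParserError raised in the make-pages.sh questions step: "EOF inside string" typically means the parser entered a quoted field and never found the closing quote. One plausible cause, not confirmed by this log, is an unbalanced double quote inside a field of the TSV. The sketch below is not the code in tsv2htm-questions.py; it only illustrates one common workaround, which is to tell pandas to treat quote characters as ordinary text. The function name `read_tsv_lenient`, the column names, and the sample rows are hypothetical.

```python
import csv
import io

import pandas as pd


def read_tsv_lenient(source):
    """Read tab-separated data, treating quote characters as literal text.

    With quoting=csv.QUOTE_NONE the parser never enters a "quoted field"
    state, so a stray double quote cannot swallow the rest of the file.
    """
    return pd.read_csv(source, sep="\t", quoting=csv.QUOTE_NONE)


# Hypothetical sample: the first data row contains an unbalanced double quote.
sample = (
    "id\tquestion\n"
    'A01425\t"Wine, beere, and ale?\n'
    "A04242\tWhat is tobacco?\n"
)

df = read_tsv_lenient(io.StringIO(sample))
print(df)
```

The trade-off is that legitimately quoted fields keep their surrounding quote characters, so this suits TSV files where tabs and newlines never occur inside a field.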