Digital Corpora and Scholarly Editions of Latin Texts: Features and Requirements of Textual Criticism

By Franz Fischer

Introduction

Digital philology has produced a wide range of new methods and formats for editing and analyzing medieval texts. The provision of digital facsimiles has put the manuscripts, the very material base of any editorial endeavor, into focus again. Several editions have been created that engage primarily with individual manuscripts; others have posited a wide range of variance as a central characteristic of medieval literature instead of relegating variants to the footnotes of ahistorically normalized and regularized texts or speculative reconstructions of archetypes and authorities.1 Nevertheless, the idea of a critical text, especially of nonvernacular medieval works, does not yet seem to be obsolete. Quite the opposite: the number of digital facsimiles of manuscripts and early print books and the quantity of document-oriented transcriptions available online are growing continually, and with them the need for critically examined and edited texts increases.2 Like a medieval reader who had little choice but to rely on the only manuscript copy available at her or his library, the modern reader without a critical text is at a loss to judge the quality of a textual version picked up randomly on the internet. Moreover, digital technologies, methods, and standards have steadily improved, creating possibilities for digital critical editions the quality of which former generations of editors could only imagine. As yet, only a relatively small number of born-digital critical editions of Greek and Latin texts exist.3

Speculum 92/S1 (October 2017). © 2017 by the Medieval Academy of America. All rights reserved.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which permits non-commercial reuse of the work with attribution. For commercial use, contact journalpermissions@press.uchicago.edu. DOI: 10.1086/693823, 0038-7134/2017/92S1-0011$10.00.

This article stems from a specialized seminar at the University of Oklahoma on “Latin Textual Criticism in the Digital Age” organized by the Digital Latin Library (DLL), a joint project of the Society for Classical Studies, the Medieval Academy of America, and the Renaissance Society of America funded by the Andrew W. Mellon Foundation’s Scholarly Communications Program.

1 E.g., editions of Parzival (http://www.parzival.unibe.ch), the Canterbury Tales, Dante’s Divina Commedia and Monarchia (http://www.sd-editions.com), or the Vercelli Book (http://www.collane.unito.it/oa/items/show/11), to name just a few. All URLs have been verified and the referenced websites have been archived as far as possible in the Internet Archive (https://archive.org/) on 26 June 2017.

2 Franz Fischer, “All Texts Are Equal, But . . . Textual Plurality and the Critical Text in Digital Scholarly Editions,” Variants 10 (2012): 77–92; online: http://kups.ub.uni-koeln.de/5056; Caroline Macé and Jost Gippert, Oxford Handbook of Greek and Latin Textual Criticism, ed. Wolfgang de Melo and Scott Scullion (Oxford, forthcoming), ch. 6, “Textual Criticism and Editing in the Digital Age.”

3 Paolo Monella, “Why Are There No Comprehensively Digital Scholarly Editions of Classical Texts?” (paper first published online April 2012; revised version [April 2014] online at http://www1.unipa.it/paolo.monella/lincei/files/why/why_paper.pdf).

This content downloaded from 134.095.065.211 on July 02, 2018 06:40:17 AM. All use subject to University of Chicago Press Terms and Conditions (http://www.journals.uchicago.edu/t-and-c).
Even so, the (albeit slowly) growing number of digital critical editions increases the demand for assembling and providing critical texts in the form of a textual corpus, because only collections or corpora of texts that are otherwise dispersed on various websites allow for a systematic analysis and for efficient research across the works of a specific author, genre, subject, period, or language as a whole.4 In this article, some features and requirements for a digital corpus of critical texts are proposed and discussed in order to realize the heuristic, explorative, and interpretative potential of integrated historical texts from the classicist and postclassicist tradition of Greek and Latin works.

Generally speaking, when corpora of classical or medieval Latin or Greek texts are compiled and published, they are stripped of their critical features, namely the accompanying introduction, commentary, and apparatus notes. One reason for this omission might be economic: if the texts are published by a traditional publishing house (such as Brepols, with its Library of Latin Texts5), the digital text versions of the corpus are considered an additional means of entry to the printed version, giving access to a large variety of texts and promoting the canonical print products, which remain indispensable for accurate citation and reference.
If the texts are published by an academic institution not primarily driven by economic interests (such as, most notably, the Perseus Digital Library6 or the Digital Library of Late-Antique Latin Texts7), the reason for skipping the critical features of a printed scholarly edition might be more practical in nature. While it is rather easy to digitize plain texts, it is very hard to encode the complex and often idiosyncratic reference system of apparatus notes (lines, lemmata, variant readings, sigla, etc.). This task requires both a lot of time and a high degree of skill on the part of the digitizing person.8

4 On the general aspects and purposes of digital corpora see the catalog of “Criteria for Reviewing Digital Text Collections,” by Ulrike Henny and Frederike Neuber in collaboration with the members of the Institut für Dokumentologie und Editorik (IDE), version 1.0, February 2017, http://www.i-d-e.de/publikationen/weitereschriften/criteria-text-collections-version-1-0/: “A few examples for collection design principles are completeness (e.g. if the corpus aims to represent the work of an author as a whole), representativeness (if the corpus claims to be representative for a specific subject domain and functions as a reference for that domain) and balance (e.g. if the corpus is built to allow for contrastive analyses between its components such as different text genres or regional language varieties).”

5 Library of Latin Texts–Online (LLT-O, 2016), online: http://www.brepols.net/Pages/BrowseBySeries.aspx?TreeSeries=LLT-O.

6 There are some exceptions to the rule of stripping away features of textual criticism, for example, in the edition of Cicero’s speeches, M. Tulli Ciceronis Orationes. See, for example, Against Catiline, work URI: http://data.perseus.org/texts/urn:cts:latinLit:phi0474.phi013; there you also find commentary notes, a translation, a vocabulary tool, and a search tool.
For the time being, the only digital corpus of Latin texts providing (mostly retrodigitized) critical editions is “Musisque Deoque: A Digital Archive of Latin Poetry, from Its Origins to the Italian Renaissance,” http://www.mqdq.it/public/.

7 Digital Library of Late-Antique Latin Texts (digilibLT): http://digiliblt.lett.unipmn.it.

8 For a semiautomated method of mapping apparatus entries on the annotated section of the main text, see Federico Boschetti, “Methods to Extend Greek and Latin Corpora with Variants and Conjectures: Mapping Critical Apparatuses onto Reference Text,” in Proceedings of the Corpus Linguistics Conference (Birmingham, 2007), online: http://ucrel.lancs.ac.uk/publications/CL2007/paper/150_Paper.pdf.

There are other causes for the omission of text-critical features, such as copyright issues9 or a predominant interest in simple text analytics and computational methods, such as stylometry, topic modeling, computational semantics, text mining, or search and retrieval applied to plain text versions.10 Be that as it may, one might ask whether it would be sufficient simply to add the information as given in the apparatus
criticus and in the philological introduction to make these texts “truly” digital critical editions. A “truly” and fully fledged digital scholarly edition is surely something more than, or at least something different from, a traditional scholarly edition in a digital format.11 But if that is the case, how does such an edition fit into a corpus of digital scholarly editions?

Digital Critical Editions: Six Case Studies

In the following analysis, six editions will be presented. They are all critical and digital editions of Latin or Greek works. They have been or are being created in connection with my personal and institutional involvement under very specific conditions, at a certain place and time, with very specific aims and scope. They serve here as case studies to identify some general characteristics of digital critical editions. On the basis of these examples, four proposals will be made for how to create a digital corpus of critical editions.

First Study: Historians from Late Antiquity

The collection and edition of fragments and testimonies of historians from late antiquity is a long-term project carried out at the University of Düsseldorf. It has been conceived as a traditional critical print edition with a parallel online presence. The edition comprises a critical text furnished with an apparatus criticus and a philological introduction. A commentary, German translation, and bibliography are planned to be published exclusively in print, as a concession to the business model of the publisher. The online version is being realized by the Cologne Center for eHumanities (CCeH) of the University of Cologne. The critical texts are edited

9 The copyright status of edited ancient or medieval texts varies according to national legislation. For instance, under German law, a critical text of an edition (created by an author deceased centuries ago) might not be copyrighted, while the introduction, commentary, and apparatus are.
Otherwise there is legal uncertainty, and uniform international guidelines or legal assistance are missing. See a recent article by Wout Dillen and Vincent Neyt, “Digital Scholarly Editing within the Boundaries of Copyright Restrictions,” in Digital Scholarship in the Humanities 31/4 (2016): 785–96, doi:10.1093/llc/fqw011, on the possibilities and limitations when working with modern manuscripts.

10 Good examples for advanced corpora of Latin texts created for this purpose are the Corpus Corporum, a “Latin text (meta-)repository and tool” developed at the University of Zurich (http://www.mlat.uzh.ch/MLS/); and the Computational Historical Semantics (CompHistSem) Latin text database and lexicon created at the Goethe-University Frankfurt (http://www.comphistsem.org). For a discussion about the gap between digital scholarly editions and text analysis see the panel discussion “Text Analysis Meets Text Encoding” at the DH2012 conference in Hamburg: http://www.dh2012.uni-hamburg.de/conference/programme/abstracts/text-analysis-meets-text-encoding.1.html.

11 For a definition see, most recently, Patrick Sahle, “What Is a Scholarly Digital Edition (SDE)?,” in Digital Scholarly Editing: Theory, Practice and Future Perspectives, ed. Matthew Driscoll and Elena Pierazzo (Cambridge, UK, 2016), 19–39; online: http://www.openbookpublishers.com//download/book/527, doi:10.11647/OBP.0095.
with Classical Text Editor (CTE),12 a software tool widely used by traditional philologists for creating multiple apparatus in printable format, namely PDF. The tool also provides an HTML and even TEI-XML output, marking up all relevant layout information of the print version: sections, fonts, italics, borders, spaces, and so on. Semantic information (such as readings, witnesses, lemmata, quotes, sigla, and references) is not marked up explicitly. As a consequence, the digital version is a mere reproduction of the print, lacking any additional features except for basic browse and search. Even so, it can be labeled a critical edition, as it provides a philological introduction and critical annotations (even if based on the work of previous editors), descriptive information, and indices, as well as, after a so-called moving wall (that is, after a certain period of time), commentary and translation. In essence, the edition follows the print paradigm. Digital methods or functionalities have not been applied. Its usability does not significantly differ from that of a printed book. Even if critically annotated and digitally presented, from a technological perspective the established texts are plain and single-dimensional (Fig.
1).13

Second Study: Saint Patrick’s “Confessio”

The digital edition of Saint Patrick’s Confessio, a fifth-century open letter by Ireland’s patron saint, is based on a critical print edition from 1950, including its critical apparatus, apparatus fontium, apparatus biblicus, and commentary, but also adds various text layers (facsimiles, translations) and features (paratexts, bibliography, scholarly articles, fiction, and more), all of which are closely interlinked and furnished with user-friendly functionalities (hyperlinks from sigla to facsimile, from lemma to text, from reference to bibliography, and so on).14 The realization of the edition entailed a wide range of tasks and actions: OCR cleanup; the acquisition of facsimiles; copyright negotiations; encoding of the canonical work structure and alignment with the structure of manuscript witnesses, prints, and translations; and, last but not least, a detailed encoding of the apparatus entries and the editor’s commentary. The presentation of various textual layers, versions, and annotations relies heavily on the application of hypertext technology, and the edition is suitably labeled a hypertext stack edition (Fig. 2).15

Third Study: Guillelmus Autissiodorensis

The digital editio princeps of William of Auxerre’s treatise on liturgy, the Summa de officiis ecclesiasticis,16 has been generated from a detailed transcription of the principal manuscript witness, includes variant readings from a selection of other witnesses, and is enriched with critical editorial markup. Published in 2007, it is the first of its kind in medieval Latin philology, as it follows a pluralistic textual paradigm and provides a critical text with a threefold apparatus, links to all facsimiles on the page level, extensive descriptions of the manuscripts, a detailed transcript of the principal manuscript witness, a reading text of an almost-contemporary revision of the text, an introduction, indices, and so forth. Applying a digital methodology and addressing a wide range of notions of text, this edition might be labeled a born-digital, multidimensional, or pluralistic scholarly edition (Fig. 3).

Fig. 1. A critical text version of testimonia on Asinius Quadratus (preview of the KfHist beta version).

12 Classical Text Editor, version 9.2 (2016): http://cte.oeaw.ac.at/?id0=main.

13 A similarly “flat” edition (from the technological point of view) is Donald J. Mastronarde’s digital edition of the scholia on Euripides: http://euripidesscholia.org/.

14 Saint Patrick’s Confessio, ed. Anthony Harvey and Franz Fischer (Dublin, 2011); online: http://confessio.ie; Franz Fischer, “Who is Patrick?—Answers from the Saint Patrick’s Confessio Hyper-Stack,” in Conference Proceedings: Supporting Digital Humanities (Copenhagen, 2011); online: http://kups.ub.uni-koeln.de/id/eprint/5054; Fischer, “All Texts Are Equal.”

15 A comparable edition (if on a slightly smaller scale) is the edition of the Schedula diversarum artium (http://schedula.uni-koeln.de/), providing all relevant texts and documents to assess and analyze the complex stages of editorial revision and textual transmission. In the form of a digital collection of three critical print editions, that edition might even be labeled a metaedition.

16 Magistri Guillelmi Autissiodorensis Summa de officiis ecclesiasticis, ed. Franz Fischer (Cologne, 2007–12); online: http://guillelmus.uni-koeln.de; Franz Fischer, “The Pluralistic Approach—The First Scholarly Edition of William of Auxerre’s Treatise on Liturgy,” Jahrbuch für Computerphilologie 10 (2010): 151–68; online: http://computerphilologie.tu-darmstadt.de/jg08/fischer.html; Fischer, “All Texts Are Equal.”

Fourth Study: Carolingian Capitularies

The Capitularia project provides transcriptions of important law texts from the Carolingian era: collections of decrees of Frankish rulers regulating political, military, ecclesiastical, social, economic, and cultural matters, usually drawn up and issued during the course of royal assemblies and distributed by so-called missi, counts and bishops. Previous critical editions published in print all failed to reflect adequately the diversity and complexity of the textual transmission. In a new editorial approach, all manuscript witnesses are being transcribed with a focus on structural information, such as rubrics, initials, and the order of chapters and capitularies.
This serves the twofold aim of respecting the individual and regional characteristics of each of these historical documents and enabling a semiautomated comparison for detecting and highlighting differences and commonalities among the witnesses (Fig. 4). These automated collations, made using the collation tool CollateX,17 constitute the basis for a critical assessment of the textual tradition and for establishing a critical text version to be published both in print and online as part of the Monumenta Germaniae Historica (MGH and dMGH, respectively).18 Aiming to document both the full textual transmission and a critical text and following a twofold publication strategy, this edition might be labeled a multiwitness hybrid edition (Fig. 5).

Fifth Study: Monasterium.net

Monasterium.net is a collaborative and virtual digital archive, presently providing access to facsimiles and descriptions of more than six hundred thousand medieval and early modern charters from more than one hundred and fifty archives. The online platform allows for digital editing of the charters at all scholarly levels: in some instances, scans are provided, along with the most basic metadata, such as repository and shelf marks; in others, short descriptions and abstracts are included and, if available, retrodigitized print editions; and in still others, veritable born-digital diplomatic editions are produced that include introductions or prefaces, diplomatic transcripts encoded according to the standard of the Charters Encoding Initiative (CEI), a diplomatic analysis, and bibliographies. Since charters usually survive as single documents, there is no critical annotation in the form of critical apparatus entries. The nature of these charter editions varies. It ranges from digital diplomatic editions in their original sense (that is, editions focusing on dating, proof of authenticity, and the analysis of the content structure of a charter)19 through digital documentary editions, focusing on external features of the documents, to data-enriched editions, with information on historical persons, places, events, or decoration (for example, in the art historical subcollection of illuminated charters)20 (Fig. 6).

Fig. 2. The first paragraph of Saint Patrick’s Confessio, with interlinked entries of the threefold apparatus and links to manuscript facsimiles, previously relevant print editions, and translations.

Fig. 3. The chapter on the Third Hour in William of Auxerre’s Summa de officiis ecclesiasticis, critical text with threefold apparatus and links to manuscript facsimiles and other text versions.

Fig. 4. A collation table of various witnesses generated by the collation tool CollateX, implemented in the Capitularia website (internal).

Fig. 5. Online edition of the Frankish Capitularies, transcription of the Parisian manuscript witness Bibliothèque nationale de France, MS lat. 9653.

17 CollateX—Software for Collating Textual Sources: http://collatex.net/.

18 See Gioele Barabucci and Franz Fischer, “The Formalization of Textual Criticism: Bridging the Gap between Automated Collation and Edited Critical Texts,” in Advances in Digital Scholarly Editing, ed. Peter Boot et al. (Leiden, forthcoming).

19 According to the definition given in the Vocabulaire international de la diplomatique, ed. Maria Milagros Cárcel Ortí, 2nd ed. (Valéncia, 1997), 24; online: http://www.cei.lmu.de/VID/VID#VID_19: “Une édition diplomatique est la publication d’un document, après établissement critique de son texte compte tenu de la tradition de celui-ci et d’un examen critique de sa sincérité et de sa datation.” The term was established in the seventeenth century during the historical debate between the Maurist scholar Jean Mabillon and the Bollandist hagiographer Daniel van Papenbroeck: see Paul Bertrand, “Du De re diplomatica au nouveau traité de diplomatique: La réception des textes fondateurs d’une discipline,” in Dom Jean Mabillon, figure majeure de l’Europe des lettres: Actes des deux colloques du tricentenaire de la mort de dom Mabillon, ed. Jean Leclant, André Vauchez, and Daniel-Odon Hurel (Paris, 2010), 605–19. Nowadays the term “diplomatic” is usually applied to very detailed transcriptions of any type of document: see Lexicon of Scholarly Editing, ed. Wout Dillen et al., s.v. “transcription (diplomatic),” http://uahost.uantwerpen.be/lse/index.php/lexicon/diplomatic-transcription/.

20 See http://www.monasterium.net/mom/IlluminierteUrkunden/collection; http://www.monasterium.net/mom/glossar.

Sixth Study: Digital Averroes Research Environment (DARE)

The Digital Averroes Research Environment (DARE) collects and edits the works of the Andalusian philosopher Averroes (Abū l-Walīd Muḥammad Ibn Aḥmad Ibn Rušd), born in Cordoba in 1126, died in Marrakesh in 1198. Through the portal, images of as many textual witnesses as possible, that is, manuscripts, incunabula,
and early printed editions, are provided online.21 At present, DARE includes only a small number of edited texts, most of them textual versions that have not yet been critically annotated. However, the portal is already a key resource for a long-term editorial project to create critical editions of the works of Averroes that reflect and analyze their extremely complex transmission back and forth through Latin, Greek, Arabic, and Hebrew, an enterprise that would have been considered impossible without digital methods and resources. The established critical-text versions will eventually be integrated into the DARE platform in order to complement a digital resource that can be labeled a knowledge site (Fig. 7).22

Fig. 6. Facsimile and transcription of a medieval Serbian charter on monasterium.net: Bari, Archivio di S. Nicola Periodo Angioino L. 22 (20 August 1346, Skopje).

Variety of Editions versus Homogeneity of a Corpus

We have just presented six examples of critical approaches towards (mostly) Latin texts in a digital editorial format. They show a great variety with respect to the content and the notion of what the text is and what the respective edition actually should do.
Some digital editions (1) provide a critical text following the Lachmannian paradigm, reconstructing some archetypal text version by following a strict methodology of recensio (transcription, collation, establishment of a stemma codicum), selectio, and emendatio.23 Others (2) abide by the Leithandschrift principle and follow a principal manuscript witness. Accurate transcriptions (3) might focus on very different details and characteristics before being enriched with critical annotations. Nowadays most digital editions provide digital facsimiles of manuscripts and prints, all of which may vary in the quality of the digital scans and in the degree to which they are integrated into and interlinked with the critical text. Some editions are multidimensional, providing various versions or layers of text, parallel texts, and translations. All digital editions are labeled according to the material and the editorial method applied: critical, diplomatic, semidiplomatic, documentary, multiwitness, archive edition, and so on. Moreover, even editions with similar labels feature various differing functionalities and presentational modes, all of which are based on a large variety of encodings, since even within the de facto standard for text encoding, as provided by the guidelines of the Text Encoding Initiative (TEI),24 there are various ways of modeling textual variance. More generally speaking, digital scholarly editions all differ with respect to the application and degree of both textual criticism and digitality (that is, the degree to which they employ and integrate digital technologies).

But if textual, or rather editorial, plurality seems to be one of the main characteristics of digital editions, how is a coherent digital corpus of scholarly editions to be constructed? How does such diversity fit into a corpus if the usefulness of a corpus is based largely on the homogeneity and representativeness of the texts that it includes?
These texts are expected to be homogeneous in order to be detectable, comparable, and analyzable across the whole corpus. Texts that are part of a corpus are supposed to be representative of a specific work, genre, or period. Having a variety of versions or textual layers of one specific work clearly does not suit the idea of a corpus of texts. Even if it were possible to integrate complex digital resources into one portal, maintaining a resource of such exponentially increased complexity would seem impracticable, given the amount of work and expertise needed and the pace of ongoing technological and methodological innovations.

21 For an overview of texts available, see http://dare.uni-koeln.de/?q=node/32.

22 Peter Shillingsburg, “How Literary Works Exist: Convenient Scholarly Editions,” Digital Humanities Quarterly 3/3 (2009), par. 4; Thomas Stäcker, “Creating the Knowledge Site—Elektronische Editionen als Aufgabe einer Forschungsbibliothek,” Bibliothek und Wissenschaft 44 (2011): 107–26.

23 Martin Litchfield West, Textual Criticism and Editorial Technique Applicable to Greek and Latin Texts (Stuttgart, 1973).

24 Guidelines of the Text Encoding Initiative (P5): http://tei-c.org/release/doc/tei-p5-doc/en/html/index.html.

Fig. 7. Averroes’s commentary on Aristotle’s Physics, translated into Latin by Michael Scotus, and a manuscript witness from Assisi (Biblioteca Comunale, MS 279, fol. 91v).
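The remark that even the TEI guidelines allow several ways of modeling textual variance can be made concrete with a small sketch. It assumes one of those ways, the parallel-segmentation method, in which each `<app>` element sits inline and holds a `<lem>` and one or more `<rdg>` elements with `wit` pointers. The function flattens such a line either to the editor's text or to the text of a single witness, which is the kind of homogeneous plain text a corpus would need. The sample line, its variant reading, and the sigla A, B, and C are invented for illustration and are not taken from any of the editions discussed; the function name `flatten` is likewise hypothetical, and nested `<app>` elements are not handled.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample line (assumption): witnesses A and B agree with the
# editor's lemma, witness C carries a variant reading.
SAMPLE = (
    '<l xmlns="http://www.tei-c.org/ns/1.0">ora et '
    '<app><lem wit="#A #B">labora</lem><rdg wit="#C">lege</rdg></app>'
    ' semper</l>'
)


def flatten(line, siglum=None):
    """Flatten one parallel-segmentation line to plain text.

    With siglum=None the editor's <lem> is chosen at each <app>;
    otherwise the first reading whose wit attribute lists the siglum.
    """
    parts = [line.text or ""]
    for child in line:
        if child.tag == f"{{{TEI_NS}}}app":
            for reading in child:  # <lem> and <rdg> in document order
                is_lem = reading.tag == f"{{{TEI_NS}}}lem"
                attested = siglum in reading.get("wit", "").split()
                if (siglum is None and is_lem) or attested:
                    parts.append("".join(reading.itertext()))
                    break
        else:
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    # Collapse whitespace left over from the markup.
    return " ".join("".join(parts).split())


root = ET.fromstring(SAMPLE)
critical = flatten(root)         # the editor's text
witness_c = flatten(root, "#C")  # the text as attested by witness C
print(critical)
print(witness_c)
```

An edition encoded with a different TEI method, for example one that keeps the apparatus apart from the text and attaches entries by location, would need its own flattening routine. That is one practical face of the heterogeneity described in this section.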
Four Proposals to Achieve a Compromise

In the following four proposals we shall explore how the two conflicting concepts and practices of idiosyncratic digital critical editing on the one hand and creating a homogeneous textual corpus on the other can be reconciled despite the apparent contradictions.

First Proposal: Digital in a Wide Sense, Critical in a Narrow Sense

The first proposal to resolve the conflict between variety of editions and homogeneity within a corpus is to create and provide editions that are both digital in a wider sense and scholarly in a narrow sense. This proposal can be divided into two strategic approaches: the first approach starts from the definition of “digital,” the second from the definition of “critical.”

1. Digital in a Wide Sense

As part of a digital corpus, each individual scholarly edition does not necessarily need to be digital in a strict sense. What does “digital edition in a strict sense” mean? According to the “Catalogue of Criteria for Reviewing Scholarly Digital Editions” as issued by the Institute for Documentology and Scholarly Editing (IDE), a scholarly edition is “an information resource which offers a critical representation of (normally) historical documents or texts. Scholarly digital editions are not merely publications in digital form; rather, they are information systems which follow a methodology determined by a digital paradigm, just as traditional print editions follow a methodology determined by the paradigms of print culture.
Given this narrow understanding of SDEs, many digital resources cannot be considered digital editions in this strict sense.”25 And in an even more apodictic manner, in his most recent article on the subject, Sahle states what can be regarded as common sense among today’s digital humanities scholars:

• “A digitized edition is not a digital edition.”
• “A digital edition cannot be given in print without a significant loss of content and functionality.”
• “A digital edition is guided by a digital paradigm in its theory, method, and practice.”26

Given these definitions, the point here is exactly the opposite: individual critical editions as part of a corpus need not strictly follow a digital paradigm; doing so is desirable but not a requirement. As demonstrated above, textual plurality and the complexity of the editorial approach towards an edited work are main characteristics of a fully fledged digital scholarly edition. In contrast, the purpose of a corpus lies in its capacity to provide a large number of homogeneously edited texts, not only to ensure a high degree of usability but also to guarantee its feasibility and long-term maintainability. Therefore in principle these editions can be digitized critical editions.

25 Patrick Sahle et al., “Criteria for Reviewing Scholarly Digital Editions,” 2014 (version 1.1), http://www.i-d-e.de/publikationen/weitereschriften/criteria-version-1-1/.
26 Sahle, “What is a scholarly digital edition?” According to Tara Andrews a scholarly digital edition is something “beyond a feature-rich electronic book”: “It is the practice of deep and/or large-scale text analysis, rather than that of textual criticism itself, which must drive the development of digital editions in all their potential.” See Tara L. Andrews, “The Third Way: Philology and Critical Edition in the Digital Age,” Variants 10 (2012): 61–76; postprint online version: http://boris.unibe.ch/43071/.
On the level of the individual text as part of a corpus, content and functionalities do not have to significantly exceed those of the print edition, although even here a certain minimum of requirements should be met (see below). However, additional digital value does need to be realized on the level of the entire corpus. What additional digital value across the entire corpus can mean will be discussed under proposal 4 below.

2. Critical in a Narrow Sense: Four Manifestations of Textual Criticism

The other half of the first proposal needs to be clarified: create and provide editions that are scholarly in a narrow sense. The term “critical” (even though often used as a synonym for “scholarly”) qualifies the meaning of scholarly, but what precisely does critical mean?

Peter Robinson, with his well-known six essential aspects of electronic editions, refers with the first three criteria to an essential philological methodology and scholarly rigor.27 According to Robinson, a digital critical edition is anchored in a historical analysis of the materials; presents hypotheses about creation and change; and supplies a record and classification of difference over time, in many dimensions and in appropriate detail. These points are widely accepted.
This definition and others brought forward by renowned scholars are supported by the wide range of digital scholarly editions currently seen.28 Be this as it may, and whatever the material, methodology, or requirements of a community, in order to make critical editions fit into a digital corpus of homogeneous texts representing works of Latin literature, the various aspects of textual criticism can be broken down into four basic manifestations of criticism: (1) critical annotation, (2) markup, (3) metadata, and (4) documentation. These essential features of a critical text must be accommodated by any model of a digital corpus, a model defining indispensable requisites and requirements for a text to be incorporated into the corpus.

(1) The first manifestation of textual criticism is critical annotation to the text, more specifically, the presence of an apparatus criticus or other means of recording textual variants and all justifications for the state of the edited text. In addition, critical annotation might include an apparatus fontium, giving references to sources and paratexts; an apparatus biblicus, as a typical feature of patristic or medieval texts; a commentary with explanatory notes or historical and philological notes; and discursive notes with present-day relevance, such as references to gender issues and sociopolitical subject matter.

(2) The second manifestation comprises the potentially very deep and extensive markup of the text: structural markup (including identifiers); markup of internal and external references or named entities; linguistic and semantic markup, such as part-of-speech tagging, lemmatization, or syntactical markup; and markup of typical features of an apparatus entry, such as sigla, references, or quotes and readings. It might also include markup of the types of apparatus entries according to categories29 such as textual,30 intertextual,31 exegetical, rhetorical,32 and metrical.33

(3) The third manifestation of textual criticism comprises all kinds of metadata and structured information on the author, the work, and the edition itself,34 that is, bibliographical information concerning the work itself, including its genre, dates, appropriate keywords, and so forth; as well as imaging parameters, responsibilities, licenses, and so on in regard to the edition; and contextual information in the form of a “critical bibliography.” Ideally, all this information is given in a standardized format (such as TEI, METS, Dublin Core, or some other bibliographic standard) with references to authority files (such as GND, VIAF, Getty Thesaurus) for named entities and using taxonomies and ontologies (SKOS, CIDOC CRM) that are relevant for the respective field of research.

27 The fourth criterion mentions the presentation of an “edited” text (only) as an option; the fifth and sixth criteria refer to digital usability: see Peter Robinson, “What Is an Electronic Critical Edition?,” Variants 1 (2002): 51–57.
28 Daniel Apollon and Claire Bélisle, “The Digital Fate of the Critical Apparatus,” in Digital Critical Editions, ed. Daniel Apollon, Claire Bélisle, and Philippe Régnier (Urbana, 2014), 81–113, here esp. 86; Elena Pierazzo, Digital Scholarly Editing: Theories, Models and Method (Farnham, Surrey, 2015); Patrick Sahle, Digitale Editionsformen: Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels, 3 vols. (Norderstedt, 2013), here esp. 2:125–57.
(4) The fourth manifestation comprises information traditionally provided in a philological introduction, paratexts, and other kinds of accompanying texts and materials, which can all be subsumed under the term “documentation.” Ideally, the material basis of the edited text is documented by digital facsimiles of manuscript witnesses and relevant printed editions. These surrogates should be the result of what has been labeled “critical digitization” in the sense that information is provided about the decisions involved in setting up the parameters for digitizing.35 The manuscripts should then be described thoroughly according to scholarly practice. Where transcriptions have been created, these should be included, as well as the source code of all manuscript descriptions, transcripts, and the critical text itself. Moreover, it is essential to present a historical analysis, hypotheses about the creation of the text, and a record and classification of differences over time.36 Most importantly, however, the editorial principles need to be made explicit.

29 For a discussion on types and categories (and respective taxonomies), see Michael Hendry’s blog post on “Categories of Adversaria” at http://curculio.org/?p=1112 (10 March 2014); Paola Italia, Fabio Vitali, and Angelo di Iorio, “Variants and Versioning between Textual Bibliography and Computer Science,” in AIUCD ’14: Proceedings of the Third AIUCD Annual Conference on Humanities and Their Methods in the Digital Ecosystem, ed. Francesca Tomasi, Roberto Rosselli del Turco, and Anna Maria Tammaro (New York, 2015); doi:10.1145/2802612.2802614; see also TEI-L thread on “Types of edits” started by Christof Schöch (10 May 2016).
30 E.g., variants (substantive, orthographic), conjectures, deletions, obelizations, transpositions, lacunae, (marginal or interlinear) additions, punctuation, speaker attribution, structure (e.g., boundaries between books, chapters, paragraphs, poems, stanzas, verses, etc.).
31 E.g., sources, parallels, later usage, reception and Nachleben (modern allusions and imitations).
32 E.g., figures of speech, tropes, style.
33 Cf. “Pede certo: Metrica latina digitale,” software developed by the University of Udine for the automatic analysis of Latin verses: http://www.pedecerto.eu/.
34 A metadata model needs to take into account the various levels of possible entities like those represented in the Functional Requirements for Bibliographic Records (FRBR) model, such as work, expression, manifestation, and item.
35 Mats Dahlström, “Critical Editing and Critical Digitization,” in Text Comparison and Digital Creativity: The Production of Presence and Meaning in Digital Text Scholarship, ed. E. Thoutenhoofd, A. van der Weel, and W. Th. van Peursen (Amsterdam, 2010), 79–97; Mats Dahlström, “Critical Transmission,” in Between Humanities and the Digital, ed. P. Svensson and D. T. Goldberg (Cambridge, MA, 2015), 467–81.
36 Robinson, “What Is an Electronic Critical Edition?,” 51–57.
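To make the second manifestation more concrete, the following minimal sketch shows how an apparatus entry with sigla, readings, and a category might be encoded and read back. It assumes the parallel-segmentation style of the TEI critical apparatus (`app`, `lem`, `rdg`, with `wit` and `type` attributes); the sample verse, the witnesses A, B, and C, and the category labels are invented for illustration and do not come from any edition discussed here.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample: the text, the sigla A/B/C, and the type values are illustrative only.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0" n="1">
  arma
  <app type="textual">
    <lem wit="#A #B">virumque</lem>
    <rdg wit="#C" type="orthographic">uirumque</rdg>
  </app>
  cano
</p>"""


def sigla(element):
    """Turn a wit attribute such as '#A #B' into a list of sigla ['A', 'B']."""
    return [ref.lstrip("#") for ref in element.get("wit", "").split()]


def read_apparatus(xml_text):
    """Collect every apparatus entry with its category, lemma, and variant readings."""
    root = ET.fromstring(xml_text)
    entries = []
    for app in root.iterfind(".//tei:app", TEI_NS):
        lem = app.find("tei:lem", TEI_NS)
        entries.append({
            "category": app.get("type"),
            "lemma": lem.text if lem is not None else None,
            "lemma_witnesses": sigla(lem) if lem is not None else [],
            "readings": [
                {"text": rdg.text, "witnesses": sigla(rdg), "type": rdg.get("type")}
                for rdg in app.iterfind("tei:rdg", TEI_NS)
            ],
        })
    return entries


for entry in read_apparatus(SAMPLE):
    print(entry)
```

Because the category sits in an attribute rather than in free prose, a corpus-wide search could in principle be restricted to one type of entry; other TEI apparatus methods would need a different reader.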
Again, the viability and success of a digital corpus of critical texts depend on finding an appropriate and functional overarching data model that is able to accommodate these forms of critical annotation and information. To this end, it may be useful to reduce the force of the term “critical” to a rather prosaic meaning and to define an absolute minimum of requirements for the incorporation of a critical text into a digital corpus. Referring to the four manifestations of textual criticism described above, this minimum of requirements could be:

(Ad 1) The critically constituted text bears all critical information (for example, in the traditional annotation format of an apparatus) required to justify the linguistic or philological form of the edited text.

(Ad 2) The work structure is clearly defined: entities such as book, chapter, paragraph, and so on are marked up accordingly in order to fit in with a corpus-wide schema for addresses and the citation of the respective text entities.
(Ad 3) Metadata is provided on the author, work, and the edition itself.

(Ad 4) The text has sufficient material documentation (manuscript descriptions and facsimiles) and a philological introduction specifying the editorial principles.

Defining the texts that are to be included in the corpus as “digital in the wider sense” (that is, not necessarily following a digital paradigm) and as “critical in a narrow sense” (fulfilling the minimal requirements of critical textual scholarship) would allow for the inclusion of (a) printed critical editions created with a digitizing process that is not too demanding; (b) existing born-digital critical editions37 with a transformation or spin-off process that is not too complicated; and (c) new born-digital critical editions created within the editorial framework provided by the corpus portal (as it is currently planned for the Digital Latin Library).38

Second Proposal: Works Rather Than Documents

The second proposal to resolve the conflict between variety of editions and homogeneity within a corpus is to focus on works rather than documents. A text corpus is not an archive. Digital editions tend to start from or grow into some sort of digital archive.39 In order to provide texts that are to some extent homogeneous, the editorial features within a corpus should not focus on contingent and individual material aspects of the text or on paleographic or codicological details. Instead of accumulating textual evidence and transcriptions of witnesses, they should focus on critical value, i.e., critical annotation, deep markup, and the establishment of some kind of representative text version with a canonical work structure. This does not mean that transcriptions, facsimiles, and so on should not be included; they should be, in some way. It is just a matter of prioritizing when creating a digital corpus. Individual scholarly editions will always have to define their own priorities and tend to emphasize particularities of the textual material and specificities of the individual research perspective.

37 See the catalogs of existing digital scholarly editions prepared by Patrick Sahle, “A Catalog of Digital Scholarly Editions,” version 3.0, snapshot 2008ff, http://www.digitale-edition.de/; Greta Franzini, “A Catalogue of Digital Editions,” https://github.com/gfranzini/digEds_cat (with a list of further catalogs at https://github.com/gfranzini/digEds_cat/wiki).
38 Digital Latin Library (DLL): http://digitallatin.org/.
39 Patrick Sahle, “Digitales Archiv und Digitale Edition: Anmerkungen zur Begriffsklärung,” in Literatur und Literaturwissenschaft auf dem Weg zu den neuen Medien, ed. Michael Stolz (Zürich, 2007), 64–84; online: http://www.germanistik.ch/scripts/download.php?id=Digitales_Archiv_und_digitale_Edition; Kenneth Price, “Edition, Project, Database, Archive, Thematic Research Collection: What’s in a Name?,” Digital Humanities Quarterly 3/3 (2009), http://www.digitalhumanities.org/dhq/vol/3/3/000053/000053.html; Dirk van Hulle, “Editie en/of Archief: Moderne manuscripten in een digitale architectuur,” in Verslagen en mededelingen van de Koninklijke Academie voor Nederlandse Taal- en Letterkunde 119/2 (2009): 163–78.
The challenge here for future corpora of critical texts is to establish a basic and interchangeable data format to which a required set of data components of complex editions as described above can be translated, transformed, or downgraded.

Third Proposal: Leave to Others What Others Do Better

Digital editions as part of a corpus cannot and should not be all-inclusive. On the contrary: a characteristic of digital editions is the overcoming of the limitations of the publication itself through integration of or, here even more importantly, through linkage to external resources.40 The theory of digital scholarly editing envisions an all-encompassing model of highly complex, layered, rich information resources. Individual digital editions, however, do not need to provide and maintain the full range of possible modules, such as high-resolution facsimiles, translations in various languages, all sorts of visualizations, additional contextual material, and user-friendly tools within one clearly delimited and self-contained publication. All these features and information enriching the reading experience and supporting individual research can hardly be provided and maintained within a single corpus. Rather, any additional feature that is not required according to the criteria of the corpus should be outsourced and either referred to via hyperlink or, if possible, embedded from external resources.41 This is especially reasonable with regard to authority files; encyclopedic knowledge, as part of online reference works and compendia; paratexts, as part of other digital corpora; and facsimiles. As for the latter, ideally cultural heritage institutions, such as archives and libraries, take care of their own material and provide descriptions, high-quality reproductions, and tools to engage with material in a standardized way so that it can be embedded and used by users and editors alike.
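One standardized interface of this kind is the IIIF Image API, which this article takes up in connection with LombardPress. As a rough sketch of what referring to a facsimile instead of hosting it can look like, the function below assembles an image request following the IIIF URL pattern (identifier, region, size, rotation, quality, format). The server address and the identifier are hypothetical placeholders, not the endpoints of any library named here.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an image request URL in the IIIF Image API pattern.

    base and identifier are assumptions supplied by the caller; the identifier
    is percent-encoded so that any slashes it contains stay inside one path segment.
    """
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and folio identifier, for illustration only.
print(iiif_image_url("https://iiif.example.org/image", "ms-279/f91v"))
# prints https://iiif.example.org/image/ms-279%2Ff91v/full/max/0/default.jpg
```

The edition then stores only the identifier of a folio, while image delivery and long-term hosting remain with the holding institution.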
The embedding of external resources can be realized in two different ways, both of which have advantages and disadvantages. The easiest method from a technical point of view is simply to include a link out of the edition that targets the external resource. An example of the application of this method is the digital edition of the St. Gall Priscian, which links to manuscript images at the Codices Electronici Sangallenses (CESG) Virtual Library (Figs. 8 and 9).42

40 This according to Patrick Sahle is one aspect of overcoming the limitations of print editions (“die Entgrenzung der Publikation”) both quantitatively (with no restrictions on space) and qualitatively (by inclusion of texts, images, audio, video): see “Zwischen Mediengebundenheit und Transmedialisierung: Anmerkungen zum Verhältnis von Edition und Medien,” in Editio 24 (2010): 23–36; doi:10.1515/edit.2010.004.
41 Cf. Joris van Zundert and Peter Boot, “The Digital Edition 2.0 and the Digital Library: Services, Not Resources,” in Bibliothek und Wissenschaft 44 (2011): 141–52; online: http://peterboot.nl/pub/vanzundert-boot-services-not-resources-2011.pdf.
42 St. Gall Priscian glosses, ed. Pádraic Moran, http://www.stgallpriscian.ie/; Codices Electronici Sangallenses (CESG), Virtual Library, http://www.cesg.unifr.ch/en/index.htm.

Fig. 8. St. Gallen, Stiftsbibliothek, MS Cod. Sang. 904, fol. 1r. The digital edition of St.
Gall Priscian glosses (on the left), with links to the manuscript images and descriptions at the Codices Electronici Sangallenses (CESG) Virtual Library (on the right).

Fig. 9. St. Gallen, Stiftsbibliothek, MS Cod. Sang. 904, fol. 1r. The digital edition of St. Gall Priscian glosses (on the left), with links to the manuscript images and descriptions at the Codices Electronici Sangallenses (CESG) Virtual Library (on the right).

The integration of external information into the edition itself might be more user-friendly. Images or texts can be either included from the external server or, if restrictions relating to technical infrastructure or copyrights do not prevent it, mirrored onto a dedicated server. A technically advanced publishing framework has been developed by Jeffrey C. Witt: the LombardPress Web application43 is designed to understand and consume common interfaces (so-called IIIF application programming interfaces44) as adopted by a growing number of leading research libraries with manuscript collections, so that images of manuscript folios can be queried directly from library servers across the world (Fig. 10).45

43 See http://lombardpress.org/web.
44 International Image Interoperability Framework (IIIF): see http://iiif.io.

Fourth Proposal: Create Additional Value across the Corpus

As pointed out under the first proposal, critical editions as part of a corpus need not be “truly digital” in the sense that they follow a digital paradigm and that they are created applying digital methods. Rather, the fourth proposal advocates the creation of additional value across the whole range of texts through the features and the technical framework of a “truly digital” corpus, based on an elementary data model for metadata, text, annotation, and paratexts.

As soon as a suitable and robust data model has been found to accommodate the various forms of textual criticism, additional value can be generated by enabling a full exploration of the data captured across the entire corpus.46 This additional value cannot be provided in print editions, and it is characteristic of both individual digital editions and digital text corpora in general.

A set of generic and corpus-wide tools, features, and functionalities should address researchers’ needs and expectations.47

(1) First, the search function is of the highest importance for any digital corpus. It should not only provide a full-text search over all textual material included in the corpus (edited texts, apparatus, introductions, etc.), but also advanced search options, such as searching by logical operators and connectors and allowing for truncation and wildcards. Needless to say, a fuzzy-search function is indispensable for finding words and strings with orthographic variance within one and the same text as well as across various texts. Ideally, each and every word of the corpus is lemmatized to allow queries to match different forms of words, which may include even synonyms.48 In addition to this, metadata allows for faceted searching of all kinds.
It could be used to search by geographical regions or places of origin or provenance; by specific centuries, decades, or years of creation; by genres (like the Thesaurus Linguae Graecae categories of historici, poetae, philosophi, theologi, oratores, etc.), or by a specific meter.49 Based on the markup, searches could be limited to a certain type or content of apparatus entries (see above).

45 LombardPress-Web builds on the “Scholastic Commentaries and Texts Archive” (SCTA: see http://scta.info/). The SCTA database first points to the ID of a respective codex surface. If the holding library’s image repository is IIIF compliant, the SCTA database will link out further to the ID of the IIIF canvas and from there to the URL of the image itself. For a draft proposal of this SCTA data model see http://lombardpress.org/2016/08/09/surfaces-canvases-and-zones/; about LombardPress in general, see http://lombardpress.org/about/.
46 In the area of linguistic corpora there have been attempts to address the issue of reconciling different formats. See, for example, Salt and Pepper at http://corpus-tools.org/. Salt and Pepper are not just methodological recommendations; they are functioning, extensible open source tools that support the integration of linguistic corpora created according to different principles into a larger framework.
47 Cf. Henny and Neuber, “Criteria for Reviewing Digital Text Collections.” There should also be a set of tools, features, and functionalities for the wider public in order to extend the usability of critical editions beyond a scholarly audience. This, however, lies beyond the scope of this article.
48 For an automated form analysis and translation, most advanced digital corpora of Latin texts, such as Perseus, Corpus Corporum, and Computational Historical Semantics (CompHistSem), use specific treebank data as developed and maintained by the Perseus project: The Ancient Greek and Latin Dependency Treebank (AGLDT, http://perseusdl.github.io/treebank_data/). For an overview of current tools for lemmatization and morphological analysis, see the Digital Classicist Wiki: https://wiki.digitalclassicist.org/Morphological_parsing_or_lemmatising_Greek_and_Latin (last modified on 6 December 2016).

Fig. 10. The Scholastic Commentaries and Texts Archive (SCTA): First distinction of book 4 of the Sentences Commentary by William of Rothwell, edited by Jeffrey C. Witt and published through LombardPress, here in a diplomatic transcription of a manuscript from Aarau (Aargauer Kantonsbibliothek, MS WettF 15), displaying at the bottom the same paragraph in a manuscript from Copenhagen (Danish Royal Library, MS GKS 1363).

(2) Another essential feature of a text corpus is an elaborate index function. Indices should be generated and interlinked both work-wide and corpus-wide from the metadata (as regards authors, works, genres, periods, keywords, etc.)
and from the markup (depending on the encoding schema with respect to named entities, that is, marked-up persons, places, dates, events, etc.), and where the texts are lemmatized, word indices could be provided. Lists of manuscripts should be created according to the structured information given in the documentation.

(3) The third fundamental functionality of a digital corpus is the provision of hyperlinks generated from explicit references, pointers, and identifiers in the markup and metadata. Internal links are to be realized as text-wide (especially connecting text and critical annotations), as work-wide (connecting text, manuscript witnesses, translations, and accompanying material) and as corpus-wide (connecting intertextual references, dictionary entries, registers, and indices). External links might point to digital archives (providing manuscript facsimiles, catalog entries and descriptions, etc.), digital corpora (providing relevant texts and contextual material), digital encyclopedias and dictionaries, and to any outsourced or externalized material (forums, audios, videos, blogs, etc.; see above).

(4) The aptitude of a digital corpus for scholarly use then completely depends on addressability and citability of all its parts and components, namely of the critical text (according to books, chapters, paragraphs, stanzas, verses, lines, words, and the respective critical annotations) and of the documentation (manuscript descriptions, transcripts, and introduction) as well as on the addressability and citability of versions, in case changes have been carried out or a progressive publication mode has been established. If the editorial framework allows for progressive publications, updates, additions, corrections, and so on (which in open software development and in digital humanities research is generally recommended50) this would have an enormous impact on all areas of the corpus.
Keeping track of versions is an extremely challenging task, especially if the corpus is supposed to provide canonical text versions that do not change.51 Be that as it may, the data model and publication framework need to make sure that every part, layer, and format of the critical edition is clearly addressable, according to a URN-naming convention as specified, for instance, by the Canonical Text Services (CTS) and used by the Perseus project and the Homer Multitext project;52 or by something similar to the Documents, Entities, and Texts (DET) system as recently presented by Peter Robinson in his widely discussed draft article on academia.edu.53

49 Cf. above, n. 33, on “Pede certo.”
50 The “release early, release often” policy was originally applied in the Linux development community. Following the publication of the essay “The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary,” by Eric S. Raymond (Beijing and Cambridge, MA, 1999); online: http://www.catb.org/~esr/writings/cathedral-bazaar/, this policy became increasingly popular among digital humanities scholars and has been adapted to publication strategies not only for tool development but also for the creation of digital scholarly editions (“progressive editions”) in order to create a tight feedback loop between the editor and expert scholars in their respective fields of research: see Gunter Vasold, “Progressive Editionen als multidimensionale Informationsräume,” in Digital Diplomatics: The Computer as a Tool for the Diplomatist?, ed. Antonella Ambrosio, Sébastien Barret, and Georg Vogeler (Böhlau, 2014), 75–88; Andrew Dunning, “Rethinking the Publication of Premodern Sources: Petrus Plaoul on the Sentences,” RIDE (A review journal for digital editions and resources, published by the IDE [Institut für Dokumentologie und Editorik]) 3 (2015); doi:10.18716/ride.a.3.3, esp. pars. 5–7.
51 Possible negative effects of updating editions have been described by Gabriel Bodard, “The Inscriptions of Aphrodisias as Electronic Publication: A User’s Perspective and a Proposed Paradigm,” Digital Medievalist 4 (2008), doi:10.16995/dm.19, pars. 30–33.

(5) No matter how user-friendly the interface of an edition or corpus may be, user scenarios and research questions cannot always be anticipated. For this reason, it is imperative to provide as much raw data and material as possible via interfaces (APIs) and downloads in order to enable scholars to access and collect the data directly. The editorial framework should allow for an import of various formats (such as TEI/XML, plain text, docx, pdf, tiff, and jpg) specified by the editorial guidelines. Ingested text files would be converted into corpus-specific XML, ideally customized TEI, in order to be stored and provided in the same format as the files created within the framework directly.

(6) In connection with downloads and APIs there is the question of copyright and licenses.
Digital humanities scholars and open-knowledge activists commonly agree today that a Creative Commons Attribution ShareAlike (CC BY-SA) license is the best way to make sure the editor’s work is appropriately credited and to ensure that the data is openly accessible and remains open data.54

Conclusion

Creating a digital corpus of critical editions is a complex task. It involves a wide range of strategic decisions to harmonize the heterogeneity of digital scholarly editions with the core feature of a corpus residing mainly in the homogeneity of the way the texts are prepared and presented. Several suggestions have been proposed to convey a maximum of textual criticism with a minimum of formal requirements in order to provide a suitable data model, a practical editing environment, and a maintainable publishing framework that is attractive to both critical editors and scholarly users. A technical and institutional framework for integrating and exploring critical editions on a large scale is a great desideratum. It also seems to be a possibility worth the effort to attain.

52 For Canonical Text Services (CTS), see the information at Sourceforge: http://cts3.sourceforge.net/; and, especially on CTS URNs, “The CITE Architecture: Technology-Independent, Machine-Actionable Citation of Scholarly Resources”: http://cite-architecture.github.io/ctsurn/.
53 The article is soon to be published in Digital Humanities Quarterly: see Peter Robinson, “Some Principles for the Making of Collaborative Scholarly Editions in Digital Form”; a draft is on academia.edu at https://www.academia.edu/12297061/Some_principles_for_the_making_of_collaborative_scholarly_editions_in_digital_form; see here esp. 7–10 (with n. 11).
54 Material published under a CC BY-SA license can be copied and redistributed in any format and adapted for any purpose, even commercially, as long as the original creator is appropriately credited and the adapted material is distributed under the same license as the original; see https://creativecommons.org/licenses/by-sa/4.0/.

Franz Fischer, University of Cologne (franz.fischer@uni-koeln.de)
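
By way of illustration of the URN-based addressing called for under requirement (4), the following minimal Python sketch splits a CTS URN into its namespace, its work hierarchy (text group, work, version), and an optional passage or passage range. It is a simplified reading of the CTS URN syntax and not a reference implementation: the exemplar level and subreferences are deliberately left out, the names `CtsUrn` and `parse_cts_urn` are hypothetical, and the identifiers in the usage example (including the version label `ed1`) are purely illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CtsUrn:
    """Simplified components of a CTS URN (no exemplar, no subreference)."""
    namespace: str
    textgroup: str
    work: Optional[str] = None
    version: Optional[str] = None
    passage_start: Optional[str] = None
    passage_end: Optional[str] = None


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split 'urn:cts:<namespace>:<textgroup.work.version>:<passage>' into parts."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")

    namespace, work_component = parts[2], parts[3]
    passage = parts[4] if len(parts) == 5 else ""
    if not namespace or not work_component:
        raise ValueError(f"missing namespace or work component: {urn!r}")

    # Work hierarchy: text group, then optionally work and version.
    hierarchy = work_component.split(".")
    if len(hierarchy) > 3 or not all(hierarchy):
        raise ValueError(f"unsupported work component: {work_component!r}")
    textgroup, work, version = (hierarchy + [None, None])[:3]

    # A passage may be a single reference ("1.1") or a range ("1.1-1.10").
    start, _, end = passage.partition("-")
    return CtsUrn(namespace, textgroup, work, version, start or None, end or None)


if __name__ == "__main__":
    # Illustrative identifiers only; "ed1" is a made-up version label.
    print(parse_cts_urn("urn:cts:latinLit:phi0959.phi006.ed1:1.1-1.10"))
```

Keeping the version as a separate component is what would allow a citation to point either at a work in general or at one fixed state of its text, which is the distinction that matters once editions are published progressively.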