Journal of Religion, Media and Digital Culture Volume 5, Issue 1 (2016) https://jrmdc.com

Some Initial Reflections on XML Markup for an Image-Based Electronic Edition of the Brooklyn Museum Aramaic Papyri

F. W. Dobbs-Allsopp, Princeton Theological Seminary
Contact: chip.dobbs-allsopp@ptsem.edu
Chris Hooker, Princeton Theological Seminary
Contact: christopher.hooker@ptsem.edu
Gregory Murray, Princeton Theological Seminary
Contact: gregory.murray@ptsem.edu

Keywords
Aramaic; Brooklyn Museum; critical edition; Elephantine; markup; papyrus; TEI; XML

Downloaded from Brill.com04/06/2021 12:40:09AM via free access

Abstract: A collaborative project of the Brooklyn Museum and a number of allied institutions, including Princeton Theological Seminary and West Semitic Research, the Digital Brooklyn Museum Aramaic Papyri (DBMAP) is to be both an image-based electronic facsimile edition of the important collection of Aramaic papyri from Elephantine housed at the Brooklyn Museum and an archival resource to support ongoing research on these papyri and the public dissemination of knowledge about them. In the process of building out a (partial) prototype of the edition, to serve as a proof of concept, we have discovered little field-specific discussion that might guide our markup decisions. Consequently, here our chief ambition is to initiate such a conversation. After a brief overview of DBMAP, we offer some initial reflection on and assessment of XML markup schemes specifically for Semitic texts from the ancient Near East that comply with TEI, CSE, and MEP guidelines. We take as our example BMAP 3 (=TAD B3.4) and we focus on markup as pertains to the editorial transcription of this documentary text and to the linguistic analysis of the text’s language.

About the Authors: F. W. “Chip” Dobbs-Allsopp is Professor of Old Testament at Princeton Theological Seminary.
His research interests include the historical, philological, and literary study of biblical and ancient Near Eastern literature (with a special focus on poetry and Northwest Semitic inscriptions) and exploring how new technologies can enhance the editing of ancient Semitic texts. Dobbs-Allsopp’s most recent monograph is On Biblical Poetry (New York/Oxford: Oxford University Press, 2015). Christopher Hooker is a PhD candidate at Princeton Theological Seminary. Gregory Murray is Director of Academic Technology and Digital Scholarship Services at Princeton Theological Seminary Library. He has worked with TEI encoding of humanities texts since 1997 (TEI P3 in SGML) and has extensive experience with text processing and XML technologies, including XSLT and XQuery.

To Cite This Article: Dobbs-Allsopp, F.W., C. Hooker and G. Murray, 2016. Some Initial Reflections on XML Markup for an Image-Based Electronic Edition of the Brooklyn Museum Aramaic Papyri. Journal of Religion, Media and Digital Culture 5(1), pp. 50-72. Online. Available at: .

Introduction: Project Overview [i]

A collaborative project of the Brooklyn Museum, Princeton Theological Seminary, and West Semitic Research, the Digital Brooklyn Museum Aramaic Papyri (DBMAP) is to be both an image-based electronic scholarly edition of the important collection of Aramaic papyri from Elephantine housed at the Brooklyn Museum and an archival resource to support ongoing research on these papyri and the public dissemination of knowledge about them. The collection, consisting of nine whole papyrus rolls (eight of which were still intact, folded, with original cords and sealings upon acquisition) and a large number of fragments (from more than eight other rolls), was bequeathed to the Brooklyn Museum by Ms. Theodora Wilbour in 1947. Ms.
Wilbour’s father, Charles Edwin Wilbour, had originally purchased these papyri in Aswan sometime between 26 January and 13 February 1893 (according to a notebook entry of his from that time). The papyri were packed in “tin biscuit boxes” and placed in a trunk with other boxes of Egyptian papyri, where they remained (ultimately stored in a New York warehouse) unknown and unread for over half a century; Wilbour died in 1896 without having revealed to anyone the contents of his purchase. As a result of Ms. Wilbour’s bequest, an editio princeps of these papyri was published in short order by Emil G. Kraeling, The Brooklyn Museum Aramaic Papyri: New Documents of the Fifth Century B.C. from the Jewish Colony at Elephantine (= BMAP; Kraeling 1953), some sixty years after their initial discovery. The papyri date to the fifth century BCE and mostly consist of legal documents having to do with the interrelations of two families spanning several generations. Historically, the collection represents the earliest major acquisition of Aramaic papyri related to the ancient military colony at Elephantine.

At the heart of the proposed project is the creation of an image-based, facsimile edition of these spectacular Aramaic papyri. As with all scholarly text editions, whether print-based or digital, the chief task is to provide a reliable and accurate simulation of the underlying source text. MLA’s Committee on Scholarly Editing guidelines (= CSE) assert that editors establish reliability by “explicitness and consistency with respect to methods, accuracy with respect to texts, adequacy and appropriateness with respect to documenting editorial principles and practice” (MLA 2011). A primary rationale for undertaking a new critical edition of any text is to improve on what older editions have achieved.
In the case of these papyri there is currently only a single critical (i.e., answering to CSE guidelines) edition available, the editio princeps published by Kraeling now almost sixty years ago. The volume in all respects is typical of a standard print-based critical edition, consisting of a long, informative “Historical Introduction” (Kraeling 1953, pp. 3-119) and editions of each of the papyri (general description, transcription, English translation, critical commentary). There are also several indices (proper names, words) and a set of photographic plates: one black and white image for each papyrus plus an assortment of other images (e.g., endorsements, unopened papyrus rolls). This edition remains the single most comprehensive treatment of these papyri as a whole, although, of course, scholarship over the last sixty years has greatly improved our understanding of almost every aspect of these papyri. That is to say, BMAP, no matter its historical contributions, can no longer serve as a fully adequate and accurate edition of these texts. In fact, most contemporary students of these papyri rely on the treatment of them in the handbook edition of the entire corpus of Aramaic documents from Egypt by Bezalel Porten and Ada Yardeni (= TAD; Porten and Yardeni 1986-1989). This latter volume features the insights and readings of the foremost student of the Elephantine Aramaic corpus, Porten, and the exquisite hand drawings of Yardeni. It offers the most accurate rendition of the Brooklyn Museum papyri generally available. But as the title suggests, TAD was never envisioned as a critical edition of these texts (e.g., there is no commentary, explicit theory of editing or photographs of the texts and only a very minimal critical apparatus).
It is our intention, then, to author a truly critical edition of these papyri, one that aspires to the traditional goals (and standards) of scholarly editing and that builds on the vast scholarly advances achieved during the period between the appearances of BMAP and TAD and since.

Reflections on XML Markup

In the process of building out a (partial) prototype of the edition to serve as a proof of concept, we have discovered little field-specific discussion that might guide our markup decisions. Consequently, here our chief ambition is to initiate such a conversation. We offer some initial reflection and assessment of XML markup schemes specifically for Semitic texts from the ancient Near East that comply with TEI (Text Encoding Initiative n.d.), CSE, and the Model Editions Partnership (MEP) guidelines. We take as our example BMAP 3 (=TAD B3.4) and we focus on markup as pertains to the editorial transcription of this documentary text and to the morphosyntactic markup (part-of-speech tagging) of the text’s language.[ii]

Editorial Transcription

Transcription may be defined as “the effort to report—insofar as typography allows—precisely what the textual inscription of a manuscript consists of” (Meulen and Tanselle 1999, p. 201). Our transcription is of a documentary text (i.e., non-literary) in a single copy and is designed to support a facsimile edition, not to stand primarily in its place.[iii] As such, our transcription is what has been traditionally described as a “typographic facsimile,” which “attempts to duplicate exactly the appearance of the original source text as far as possible within the limits of modern typesetting technology” (Kline and Perdue 2008, p. 147; cf. Meulen and Tanselle 1999, p.
201-3).[iv] Where we believe explicit editorial comment is warranted (e.g., with regard to scribal alterations or where a reading is graphically ambiguous), users will be pointed to an epigraphic commentary for elaboration and discussion and will always be able to compare the transcription with the digital facsimile.

A chief editorial aim in our transcription, then, is to report “what actually appears in a manuscript” as faithfully as possible (Meulen and Tanselle 1999, p. 203). Within the limits of folio technology, this has usually required a conscious editorial decision to forgo the incorporation of editorial emendations, for example in order to correct apparent errors. As Meulen and Tanselle state (as late as 1999), “a text cannot simultaneously be unemended and emended” (1999, p. 203). In a digital environment this is no longer the case. The <choice> element in the General TEI Guidelines (3.4) specifically enables the encoder to represent, for example, a text in its ‘original’ uncorrected and unaltered form, alongside the same text in one or more ‘edited’ forms. This usage permits software to switch automatically between one ‘view’ of a text and another, so that (for example) a stylesheet may be set to display either the text in its original form or after the application of editorial interventions of particular kinds. This gives us the very attractive opportunity both to present the textual artifact as it has been historically preserved and to register editorial interventions where we deem them desirable. Our interventions remain minimal, restricted (at this point) to apparent errors. The (inter-linear) writing of šhdy without the final aleph in l. 23 in the formulaic phrase šhdyʾ bgw “the witnesses herein are” (BMAP 2.14-15, 4.23, 5.16, 7.43, 8.23, 10.18, 11.13; cf.
1.10) is a case in point:

(1) šhdy šhdyʾ

This makes good editorial (and historical) sense. It also means that when we add the morphosyntactic markup we are not left only with the erroneous analysis, in this case as if the noun were actually a plural construct. This leads to a general observation, namely, that because of the digital environment the various aspects of our critical edition (e.g., facsimile, transcription, translation) and archival resource (e.g., morphosyntactic analysis) need not fully overlap or seamlessly agree. The various components can be manipulated to multiple (and even conflicting) ends.

The transcription will be offered in two scripts: a transliterated roman script and an Aramaic block script. All the XML markup was composed initially using the roman script. The markup for the transcription in the Aramaic block script (and also for the digital facsimile itself) will be generated from this same file. After some experimentation and following MEP recommendations (Chesnutt, Hockey and McQueen 1999), we follow a gradual markup procedure, attending to one variety (or level) of markup at a time (e.g., editorial transcription, morphosyntactic analysis).

The basic elements in our markup scheme are three: the alphanumeric characters themselves and indications for line and word division:

(2) b 3 3 1 l ʾlwl hw

These correspond to the characters of the Aramaic script and numerical notation system and the two meta-script conventions (line division, word spacing) habitually employed by the scribe (Haggai b.
Shemaiah):

(3) [Image: detail of BMAP 3.1-3]

The writing consists of alphabetic characters and numerical ciphers grouped by word spacing and organized into horizontal lines. We employ the standard transliteration conventions formulated by the SBL (Alexander 1999, §5.1.1.1) for representing the Aramaic script, a linear alphabetic script:

(4) ʾbgdhwzḥṭyklmnʿspršt

For numbers up to ninety-nine, the Aramaic numeral notation system is purely cumulative-additive, consisting of signs for 20, 10, and 1. The unit-signs are grouped in threes (since up to nine such signs could be required). We use the corresponding Arabic numerals for the three component signs (20, 10, and 1), plus 2 and 3, depending on how the units are grouped. For example, the number twenty-eight is written out in l. 1 with a cipher composed of the following signs:

(5) [Image: detail of BMAP 3.1] 20 3 3 3 2 20 3 3 2

The lines are right-adjusted and line-ends always coincide with graphic word boundaries. These lines are presentational in nature only, and thus bear no semantic significance for what is written. That is, the text is written in a running format with lines ending where they may, constrained only by the width of the sheet of papyrus being used and coincidence of word boundary. We use the “anonymous block” element (<ab>) in TEI with the @type ("line") and @n (e.g., "BMAP.3.R:01") attributes to signify these lines:

(6) ... ... ...

As generally in alphabetic writing from the ancient Levant, the Aramaic of the Elephantine papyri is written out with word division.
Spacing (a brief segment of the papyrus left uninscribed) is used in these papyri to signify (graphic) word division, a convention of Aramaic scribal practice that becomes prominent in the seventh century BCE (e.g., KAI 233 = Assur Ostracon; TAD A1.1). We use whitespace wrapped in the “punctuation character” element (<pc>) to represent this meta-script convention. This turns out to have a number of benefits. First, it underscores the fact that such a use of spacing is a material, meta-script convention (e.g., like the use of commas). The <pc> element according to the TEI guidelines “contains a character or string of characters regarded as constituting a single punctuation mark” (17.1.2). In this instance, spacing is used just like the point or dot in the old Hebrew script or the small cuneiform wedge in the alphabetic cuneiform from ancient Ugarit, and stands in contrast to the continua scripta tradition of alphabetic writing without word dividers, as in some Phoenician scripts and in ancient Greek manuscripts. Second, it allows a more perspicuous linguistic description in the coding since a graphic word does not necessarily correspond to a linguistic (or grammatical) word. For example, prepositional phrases with the proclitic prepositions b-, l-, and k- are written out graphically together with their objects, e.g., lʿnnyh (l + ʿnnyh) “to Ananiah” (3.3), bʾbny (b + ʾbny) “in the stone weights” (3.6). Thus, the “word” (<w>) element may be reserved for representing a “grammatical (not necessarily orthographic) word” (17.1.1). This use of the <pc> element also allows the use of whitespace within the XML markup for human readability: only whitespace wrapped in a <pc> element indicates a character from the text, while all other whitespace is insignificant.
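The whitespace convention just described can be illustrated with a minimal Python sketch. The sample markup below is a hypothetical reconstruction built only from details given above (the <ab> element with @type="line" and an @n value such as "BMAP.3.R:01", whitespace wrapped in <pc>, and two graphic words from example 2); the project's actual files may well differ, and the function name significant_text is our own invention for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one line (<ab>) holding two graphic words separated
# by a word space wrapped in <pc>. The indentation and newlines are only
# there for human readability and should not count as text.
SAMPLE = """
<ab type="line" n="BMAP.3.R:01">
  ʾlwl
  <pc> </pc>
  hw
</ab>
"""


def significant_text(el):
    """Return the transcribed characters of an element.

    Whitespace inside <pc> is kept as a character of the text;
    all other whitespace is dropped as insignificant.
    """
    if el.tag == "pc":
        return el.text or ""
    parts = ["".join((el.text or "").split())]
    for child in el:
        parts.append(significant_text(child))
        # Text following a child element belongs to the parent.
        parts.append("".join((child.tail or "").split()))
    return "".join(parts)


line = ET.fromstring(SAMPLE)
print(line.get("n"), significant_text(line))
```

Under these assumptions the pretty-printing whitespace disappears and only the <pc>-wrapped space survives between the two graphic words. The sketch deliberately leaves aside how spaces between the component signs of a cipher would be treated.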
Alterations observed in the source text are mainly of two kinds, additions and deletions, for which the “addition” (<add>) and “deletion” (<del>) elements from the core TEI Guidelines (3.4.3; cf. 1.3.1.4) are used. With the <add> element, the @hand (e.g., "scribe," "witness 1") and @place (e.g., "above," "inline") attributes are used, and with the <del> element, the @type (e.g., "erasure") attribute. Examples:

(7) ʾlhʾ (3.10) (the scribe added the word ʾlhʾ “the god” above the line)

(8) 3 1 1 (3.6) (the scribe originally wrote the cipher for the number “5”; by erasing the last vertical stroke, he corrected the number to “4”)

Additions, which (in this document) are generally inclusions of material accidentally left out initially and written in inter-linearly above the line, are marked approximately at the point in the text where they are inserted (often above spacing between words; cf. Meulen and Tanselle 1999, p. 205). Deletions, which are mostly erasures, are marked at the inter-linear point at which they occur. The markup aims only to report the fact of alteration. All additional editorial comment, including specification of chronological sequence, is reserved for the epigraphic commentary. The presence of the accompanying facsimile offers a helpful clarifying aid, relieving the markup of the need to be overly precise as to point of execution.

On occasion it is apparent that a deletion and addition have been coordinated.
For example, at the beginning of 3.3 the scribe initially wrote kl nšn 2, “altogether, 2 ladies.” Upon recognizing his mistake (the sellers are husband and wife) he erased the final vertical stroke in the cipher for the numeral “2” (converting it to the cipher for the numeral “1”) and added gbr 1 “1 man” inter-linearly above the line following the word kl “all.”

(9) kl gbr 1 nšn 1 1

There is no way of knowing the precise sequence in which these alterations were executed (i.e., addition then erasure, erasure then addition), but it is at least clear that they are interdependent. In such cases, the TEI Guidelines allow for the use of the “substitution” (<subst>) element (11.3.3.1.5) to group coordinated alterations. However, since in some cases (as here) the alterations are not proximate, we forgo the use of this element, marking only the fact of an addition and deletion and relying on the epigraphic commentary to detail a more precise characterization of the coordination (or of other relevant matters).

There are places where a material reading is unclear for some reason. For example, in 3.2 the wife’s name was originally written as wbl. Later the scribe added an additional letter super-linearly, above and to the left of the bet:

(10) [Image: detail of BMAP 3]

Kraeling construes the letter as an aleph and reads ʾwbl (BMAP, pp. 158-59). In contrast, Porten and Yardeni construe the letter as a yod and read wbyl (TAD B, p. 64). The name is apparently Hurrian (Kornfeld 1978, p. 113) and is spelled three other ways: ʾwbl (3.10), ybl (3.25), and ʾwbyl (BMAP 4.3).
Graphically, the inserted letter patterns more like a yod than an aleph, especially in overall size, and its placement (above the line after the bet) is also consistent with the yod in ʾwbyl (BMAP 4.3). If the scribe intended an aleph as the initial consonant in the name, presumably he would have inserted the letter above the line and closer to the beginning of the name, for which there is plenty of space (most of the super-linear additions in this document are inserted beginning approximately at the point in the text where they would fall most naturally, e.g., gbr 1 in 3.3; ʾlhʾ in 3.10; zk in 3.12).

(11) [Image: gbr 1 in 3.3]
(12) [Image: ʾlhʾ in 3.10]
(13) [Image: zk in 3.12]

In such cases, the TEI Guidelines provide for multiple ways of marking such variation. We have opted to use the “apparatus entry” (<app>) element, which may be used “whether or not represented by a critical apparatus in the source text,” with the parallel segmentation method for coding variant readings (12.2.3), because it provides maximum transparency and flexibility. When there is an editorial preference for a reading (as here) we mark that with the “lemma” (<lem>) element (with the @resp attribute signaling any supporting opinions, e.g., "TAD"). Other readings are marked with the “reading” (<rdg>) element and the @resp attribute (e.g., "K"). So our markup for this example is as follows:

(14) wb y ʾ

If the alternative readings are judged to be equally preferable, each is marked with the <rdg> element (with @resp attribute), as (possibly)[v] in the following example (3.24):

(15) BMAP 3.14 ḥyḥ ḥyrw

Again, the main intent is reportorial in nature. Any supporting rationale will be given in the epigraphic commentary.

Morphosyntactic Markup

Classification of words into parts of speech (or word classes) is not entirely straightforward.
The modern linguistic practice has been to use morphological and syntactic criteria for defining parts of speech, which of course vary cross-linguistically and do not necessarily totally overlap even within languages. The appendix contains our working POS classification. The intention here is not to innovate linguistically. We have attempted to use an intuitive, field-specific sense of the relevant grammatical and lexical categories in use. In general, our default analysis is cued principally by the treatments in standard grammars (e.g., GEA (Muraoka and Porten 1998)) and lexicons (e.g., CAL (Kaufman et al., n.d.), DNWSI) of the various Aramaic dialects. The chief aim in providing such tagging is to ease usability of these documents and to support the linguistic analysis of their language. Furthermore, wanting all markup to be well-formed XML, and thus enabling general portability and use of standard XML parsers for processing and the like, we have utilized only five general TEI elements:

(16)
<name> (name, proper noun) contains a proper noun or noun phrase
<num> (number) contains a number, written in any form
<w> (word) represents a grammatical (not necessarily orthographic) word
<m> (morpheme) represents a grammatical morpheme
<abbr> (abbreviation) contains an abbreviation of any sort

Four of the five elements (<name>, <num>, <m>, and <abbr>) are used fairly restrictively. The <w> element, in contrast, does the bulk of the descriptive work.[vi]

Several observations about the markup itself. First, the <name> and <num> elements provided by TEI accord well with the fact that proper names and numbers are generally distinguished lexicographically in Semitic (and Aramaic in particular) from other word categories. Numerals are mostly indicated through ciphers in this document, which we indicate with the @type (="cipher") and @value (e.g., "20") attributes.
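As a rough illustration of how the @value of a cipher relates to its component signs, the following Python sketch sums the transliterated signs (20, 10, and 1, with units grouped as 3 or 2) under the cumulative-additive principle described earlier for numbers up to ninety-nine. The function names cipher_value and num_element are hypothetical, and placing the sign transliteration as the text content of <num> is an assumption for illustration, not the project's documented encoding.

```python
import xml.etree.ElementTree as ET

# Signs used in the transliteration of ciphers: 20, 10, and 1,
# plus 3 and 2 for grouped unit strokes.
ALLOWED_SIGNS = {20, 10, 3, 2, 1}


def cipher_value(signs: str) -> int:
    """Sum the space-separated component signs of a cipher (numbers up to 99)."""
    values = [int(token) for token in signs.split()]
    unknown = [v for v in values if v not in ALLOWED_SIGNS]
    if unknown:
        raise ValueError(f"unexpected cipher sign(s): {unknown}")
    return sum(values)


def num_element(signs: str) -> ET.Element:
    """Build a <num type="cipher"> element (hypothetical layout: signs as text)."""
    el = ET.Element("num", {"type": "cipher", "value": str(cipher_value(signs))})
    el.text = signs
    return el


# The cipher for twenty-eight in l. 1 is composed of the signs 20 3 3 2.
print(ET.tostring(num_element("20 3 3 2"), encoding="unicode"))
```

The same function covers the erasure discussed for 3.6: the signs 3 1 1 sum to 5, and removing the last stroke leaves 3 1, which sums to 4.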
When a number is spelled out, as in 3.16 (lʿšrtʾ "to the ten"), we indicate what kind of number with the @type attribute (e.g., "cardinal") and then wrap it within the <w> element:

(17) ʿšrt

We do something similar with the <abbr> element. Since we are not interested necessarily in expanding abbreviations within the transcription but in pointing to a lexical entry, we identify an abbreviation with the <abbr> element and then wrap it in the <w> element:

(18) r (3.6)

The <w> element, the workhorse of this markup scheme, may appear with as many as three attributes: the @type attribute identifies the relevant part of speech; the @subtype attribute provides pertinent inflectional information (e.g., for nouns: gender, number and state; for verbs: binyan, TAM, person, gender, and number); and the @lemma attribute points to the citation form in a lexicon (module). In the case of homographs, we have followed the ordering found in CAL. As a rule, the attributes are used only as relevant (e.g., prepositions, conjunctions and the like require no @subtype entry) and only to the extent relevant (e.g., in the @subtype description, nouns with possessive suffixes are marked only for gender and number).[vii] Initially, we have erred on the side of providing more descriptive POS categories, especially when it comes to the various kinds of particles, conjunctions, adverbs, and the like that are used.

We treat clitics differently, depending on their kind. Clitics are (phonologically) bound forms (“constrained to occurring next to an autonomous word” (Hopper and Traugott 1993, p. 5)) that have an independent syntactic role and thus may be thought of as standing halfway between autonomous words and fully grammaticalized affixes. Prepositions and pronouns are two word categories that often become cliticized in natural languages.
We have marked the proclitic prepositions (b-, l-, k-) and conjunctive waw (w-) with the <w> element:

(19) b (3.1)
l (3.1)
k (3.23)
w (3.23)

This is consistent with the general treatment these elements receive from lexicographers, who habitually provide lexical entries for them in the standard lexicons. By contrast, the pronominal suffixes attached to verbs, nouns, and prepositions have been marked with the <m> element:

(20) bh (3.22)
(21) ksp k (3.22)
(22) ygrn k (3.19)

The logic here is twofold: one, pronominal suffixes are not treated separately lexicographically in Aramaic (and in West Semitic generally), and, two, they are not considered a part of the standard inflectional feature set for verbs and nouns.[viii] Marking them with the <m> element (instead of with the <w> element) signals both of these distinctions and captures as well these clitics' strong resemblance to other affixes (e.g., suffixes on the Perfect), their lexical status notwithstanding.[ix]

Another area where we default to the lexicographers (at least initially) is in our treatment of compound or pseudo prepositions (GEA, 87). For example, following CAL (and GEA) we consider both bšm (l. 13) and br mn (l. 21) as fully grammaticalized and thus autonomous lexical items and mark them as such:

(23) br mn (3.21)

Alternative markup privileging the decomposition of these complex items is readily imaginable and perhaps could be handled using the element:

(24) bšm bšm
(25) br mn br mn

We have used the traditional nomenclature for the various verbal binyanim (e.g.
Peal, Pael, Afel) in the markup but will write a program that allows users to shift back and forth between this set of terms and the newer, cross-Semitic terms (e.g., G, D, C, as used in CAL).

Conclusion

In closing, nothing about what we have just reviewed in terms of XML markup seems to us to be revolutionary, either technically or theoretically. The surprise remains the general absence of a scholarly discussion on such issues in the field. In part we suspect this is because most of the digital-based text projects in the field to date have been dominantly entrepreneurial in motivation and orientation and not conceived as research or scholarship. There are exceptions. For example, there seems to be a live interest currently in leveraging digital resources for syntactic analysis of various (Semitic) text corpora, and there are now a number of sites dedicated to presenting transcriptions of cuneiform literature (e.g., Sources for Early Akkadian Literature, http://www.seal.uni-leipzig.de/). But to our knowledge no digital-based project involving texts from the ancient Near East (esp. pre-Hellenistic corpora) has been conceived of from an explicitly articulated editorial perspective.[x] That is, most of the commonly used electronic text resources in the field (e.g., Accordance, Logos, Michigan-Claremont-Westminster Electronic Hebrew Bible) are essentially what is known as “reader editions.” They are not critical or scholarly editions and therefore, ultimately, cannot be depended on academically.
These “reader” editions have served the field well, showing, for example, the viability and benefit of electronic text-based resources and “tools.” Now the field needs to take the next step: to create critical, scholarly editions that will make use of all of the advantages of the currently available electronic reader editions and also be trustworthy and reliable. This is what we are proposing to do with DBMAP.

Notes

i This represents a slightly revised version of a paper presented in the Digital Humanities in Biblical, Early Jewish, and Christian Studies unit at the Annual Meeting of the Society of Biblical Literature in San Diego, CA (November 23, 2014). The images used in examples 3, 5, and 10-13 are details of BMAP 3 (47.218.95; =TAD B3.4). InscriptiFact Text ISF_TXT_00055. Photograph by Bruce and Kenneth Zuckerman, West Semitic Research. Courtesy Brooklyn Museum. Reuse of these images is prohibited without permission of the rights-holders. We thank Bruce Zuckerman and Marilyn Lundberg of West Semitic Research and Ed Bleiberg of the Brooklyn Museum for their support of this project more generally.

ii In what follows, we employ inline markup. As one reviewer of this paper has pointed out, however, other methods of markup, such as stand-off markup (see, for example: http://www.tei-c.org/Activities/Workgroups/SO/sow06.xml; http://www.balisage.net/Proceedings/vol5/html/Banski01/BalisageVol5-Banski01.html), may actually end up being more congenial to our project. We find this an incredibly generative observation and plan to explore further such possibilities as the project moves forward.

iii Kline and Perdue (2008, p. 147): “Increasingly common are print editions in which a photo facsimile appears as part of a parallel text accompanying a printed editorial transcription.
Digital scanning creates wider options for editors who wish to offer such photographic images in online or DVD-based editions, conveniently linked to machine-searchable transcriptions, accessed through automated indexes.”

iv Both BMAP and TAD (unwittingly) offer approximations of a “typographic facsimile,” although neither is consistent on this issue since these volumes are not expressly theorized from an editorial perspective.

v This is by way of example only and follows the judgment of Porten and Yardeni (TAD B, 64); we have not looked closely at personal names to this point.

vi Here we emphasize the practical and limited nature of our initial experiment. There are other standards that are both compatible with TEI and promote the use and reuse of textual data across applications, e.g., LAF (Linguistic Annotation Framework, ISO 2012; see Ide and Romary 2004, pp. 211-225).

vii Historically, possessive suffixes were attached to nouns after the case endings in Aramaic, and syntactically, nouns with possessive suffixes are considered determined. Some synchronic grammars of specific Aramaic dialects (e.g., GEA, 46; Hug 1993, p. 56) indicate that the suffixes are attached to the construct forms of nouns. Whether this is the right analysis is open to debate, but even if correct, for our purposes, the presence of a possessive suffix implicates the use of the construct state of the noun, and therefore need not be explicitly marked (cf. Bar-Haim, Sima'an, and Winter 2008, p. 7).

viii Contrast the suffixes on the Perfect form of the verb, which are clearly related historically to the larger pronominal system, with the chief difference that over time they became fully grammaticalized as suffixes, and thus a part of the verb's inflectional morphology.

ix Contrast Bar-Haim, Sima'an, and Winter (2008, pp. 7, 28), who treat pronominal suffixes on verbs and prepositions in Modern Hebrew as word segments, but not possessive suffixes on nouns.
x For example, SEAL offers this as its main rationale: “to enable the efficient study of the entire early Akkadian literature in all its philological, literary, and historical aspects.” The site boasts of new “collations” for the texts presented, but offers no explicit editorial theory for guidance. Presumably this is to be elaborated in the print volumes under production.

Appendix: Part of Speech (POS) Inventory

<name> (name, proper noun) contains a proper noun or noun phrase
  @type="person" "divine" "place" "gentilic"

<num> (number) contains a number, written in any form
  @type="cipher" "cardinal" "ordinal" "fraction" "multiplicative"

<abbr> (abbreviation) contains an abbreviation of any sort

<m> (morpheme) represents a grammatical morpheme
  *mainly used (now) for representing object and possessive suffixes
  @type="sf-(person, gender, number)"

<w> (word) represents a grammatical (not necessarily orthographic) word
  @type="pos(sessive)" @lemma="(dictionary entry)"
  @type="indef(inite)" @lemma="(dictionary entry)"
  @type="prep(osition)" @lemma="(dictionary entry)"
  @type="conj(unction)" @lemma="(dictionary entry)"
  @type="neg(ative)" @lemma="(dictionary entry)"
  @type="cond(itional)" @lemma="(dictionary entry)"
  @type="inter(rogative)" @lemma="(dictionary entry)"
  @type="adverb" @lemma="(dictionary entry)"
  @type="interj(ection)"
  @type="verb" @subtype="(binyan: pe, pa, af/haf, ethpe, ethpa, ettaf)-(TAM: pf, impf, impv, inf, part)-(person, gender, number)" @lemma="(dictionary entry)"
  @type="noun" @subtype="(gender, number)-(state: abs, cstr, det)" @lemma="(dictionary entry)"
  @type="adj(ective)" @subtype="(gender, number)-(state: abs, cstr, det)" @lemma="(dictionary entry)"
  @type="pron(oun)" @subtype="_(person, gender, number)" @lemma="(dictionary entry)"
  @lemma="(dictionary entry)"
  @type="exist(ence)" @lemma="(dictionary entry)"
  @type="part(icle)" @lemma="(dictionary entry)"
  @type="abbr(eviation)" @lemma="(dictionary entry)"

Bibliography

Alexander, P. H., ed., 1999. The SBL Handbook of Style: For Ancient Near Eastern, Biblical, and Early Christian Studies. Peabody: Hendrickson.
Bar-Haim, R., K. Sima'an, and Y. Winter, 2008. Part-of-Speech Tagging of Modern Hebrew Text. Natural Language Engineering 14(2), pp. 223-251.
Chesnutt, D. R., S. M. Hockey, and C. M. Sperberg-McQueen, 1999. Markup Guidelines for Documentary Editions. 4 July. [online] Available at: http://xml.coverpages.org/MepGuide199909.html [Accessed 19 December 2014]
Donner, H. and W. Röllig, 1966-1969 and 2001. Kanaanäische und aramäische Inschriften. 2d ed. and 5th ed. 3 vols. Wiesbaden: Harrassowitz. (= KAI)
Hoftijzer, J. and K. Jongeling, 1995. The Dictionary of North-West Semitic Inscriptions. 2 vols. Leiden: Brill. (= DNWSI)
Hopper, P. J., and E. C. Traugott, 1993. Grammaticalization. Cambridge: Cambridge University Press.
Hug, V., 1993. Altaramäische Grammatik der Texte des 7. und 6. Jh.s v. Chr. Heidelberg: Heidelberger Orientverlag.
Ide, N., and L. Romary, 2004. International Standard for a Linguistic Annotation Framework. Natural Language Engineering 10(3-4), pp. 211-225.
ISO, 2012. Language Resource Management – Linguistic Annotation Framework. ISO 24612:2012. Edition 1. [online] Available at: http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=37326 [Accessed 19 December 2014] (= LAF)
Kaufman, S., et al., n.d. Comprehensive Aramaic Lexicon. Cincinnati: Hebrew Union College. [online] Available at: http://cal1.cn.huc.edu/ [Accessed 19 December 2014] (= CAL)
Kline, M.-J., and S. H. Perdue, 2008. A Guide to Documentary Editing. 3d ed. Charlottesville: University of Virginia. [online] Available at: http://gde.upress.virginia.edu/ [Accessed 19 December 2014]
Kraeling, E. G., 1953. The Brooklyn Museum Aramaic Papyri: New Documents of the Fifth Century B.C. from the Jewish Colony at Elephantine. New Haven: Yale University Press. (= BMAP)
Meulen, D. L. V., and G. T. Tanselle, 1999. A System of Manuscript Transcription. Studies in Bibliography 52, pp. 201-212.
MLA's Committee on Scholarly Editions, 2001. Guidelines for Editors of Scholarly Editions. MLA.org. [online] Available at: http://www.mla.org/cse_guidelines [Accessed 19 December 2014] (= CSE)
Muraoka, T., and B. Porten, 1998. A Grammar of Egyptian Aramaic. Leiden: Brill. (= GEA)
Porten, B., and A. Yardeni, 1986-99. Textbook of Aramaic Documents from Ancient Egypt. 4 vols. Winona Lake: Eisenbrauns. (= TAD)
Text Encoding Initiative, n.d. P5: Guidelines for Electronic Text Encoding and Interchange. [online] Available at: http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html [Accessed 19 December 2014] (= TEI)
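Illustrative sketch of the Appendix inventory

The Part of Speech inventory in the Appendix, together with the treatment of suffixes discussed in notes vii to ix, might be applied along the following lines. This is a minimal sketch under stated assumptions, not an excerpt from the DBMAP encoding: the transliterated forms are common Aramaic words chosen for illustration rather than readings from BMAP 3, and the exact spelling of the attribute values (for example "pe-pf-1cs", "ms-det", "sf-2ms") is our guess at how the patterns in the inventory could be filled in.

```xml
<!-- Hypothetical example: forms and attribute values are illustrative assumptions. -->
<ab>
  <!-- Verb: binyan pe, perfect, first person common singular -->
  <w type="verb" subtype="pe-pf-1cs" lemma="yhb">yhbt</w>
  <!-- Preposition with a pronominal suffix encoded as a morpheme -->
  <w type="prep" lemma="l">l<m type="sf-2ms">k</m></w>
  <!-- Noun in the determined state: gender and number, then state -->
  <w type="noun" subtype="ms-det" lemma="byt">bytʾ</w>
  <!-- Noun with a possessive suffix: state is left unmarked (see note vii) -->
  <w type="noun" subtype="ms" lemma="byt">byt<m type="sf-2ms">k</m></w>
</ab>
```

Nesting the suffix as an <m> element inside its host <w> keeps the orthographic word intact while still letting a query address the suffix separately, which is the distinction drawn in notes viii and ix.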