Cronfa - Swansea University Open Access Repository

This is an author produced version of a paper published in: Digital Scholarship in the Humanities

Cronfa URL for this paper: http://cronfa.swan.ac.uk/Record/cronfa27244

Paper: Cheesman, T., Flanagan, K., Thiel, S., Rybicki, J., Laramee, R., Hope, J. & Roos, A. (2016). Multi-Retranslation Corpora: Visibility, Variation, Value and Virtue. Digital Scholarship in the Humanities. http://dx.doi.org/10.1093/llc/fqw027

This article is brought to you by Swansea University. Any person downloading material is agreeing to abide by the terms of the repository licence. Authors are personally responsible for adhering to publisher restrictions or conditions. When uploading content they are required to comply with their publisher agreement and the SHERPA RoMEO database to judge whether or not it is copyright safe to add this version of the paper to this repository. http://www.swansea.ac.uk/iss/researchsupport/cronfa-support/

Multi-Retranslation Corpora: Visibility, Variation, Value, and Virtue

Tom Cheesman, Department of Languages, Swansea University, UK
Kevin Flanagan, Department of Languages, Swansea University, UK and SDL Research, Bristol
Stephan Thiel, Bauhaus University Weimar, Germany and Studio Nand, Berlin
Jan Rybicki, Institute of English Studies, Jagiellonian University, Krakow
Robert S.
Laramee, Department of Computer Science, Swansea University, UK
Jonathan Hope, Department of English, Strathclyde University, UK
Avraham Roos, Amsterdam School of Culture and History, University of Amsterdam, the Netherlands

Correspondence: Tom Cheesman, Department of Languages, Swansea University, SA2 8PP, UK. E-mail: t.cheesman@swansea.ac.uk

Digital Scholarship in the Humanities © The Author 2016. Published by Oxford University Press on behalf of EADH. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Variation among human translations is usually invisible, little understood, and under-valued. Previous statistical research finds that translations vary most where the source items are most semantically significant or express most ‘attitude’ (affect, evaluation, ideology). Understanding how and why translations vary is important for translator training and translation quality assessment, for cultural research, and for machine translation development. Our experimental project began with the intuition that quantitative variation in a corpus of historical retranslations might be used to project quasi-qualitative annotations onto the translated text. We present a web-based system which enables users to create parallel, segment-aligned multi-version corpora, and provides visual interfaces for exploring multiple translations, with their variation projected onto a base text. The system can support any corpus of variant versions. We report experiments using our tools (and stylometric analysis) to investigate a corpus of forty
German versions of a work by Shakespeare. Initial findings lead to more questions than answers.

doi:10.1093/llc/fqw027. Digital Scholarship in the Humanities, Advance Access published August 23, 2016. Downloaded from http://dsh.oxfordjournals.org/ by guest on October 10, 2016.

1 Introduction

Our project began with a simple observation and an intuition. The observation: in any set of multiple translations in a given language, variation among them varies through the course of the text. Some text units or chunks (at any level from word, say, up to chapter or character part in a play) are more variously translated than others. The intuition: this variation can be used to project an annotation onto the translated text, indicating where and how the extent of translation variation varies. This is the essence of our online system. It uses a ‘Translation Array’ (a parallel multi-translation corpus, aligned to a ‘base text’ of the translated work) to achieve ‘Version Variation Visualization’. Here, ‘version’ encompasses any text which can be at least partly aligned with others. But the website strapline is: ‘Explore great works with their world-wide translations’.1

If multiple translations of a work exist, then the work is enduringly popular and/or prestigious, canonical or classic, in the translating culture: typically ‘great works’ of scripture, literature, philosophy, etc.2 Interest in comparing such works’ multiple translations is surprisingly limited. Some large aligned retranslations corpora are publicly accessible online (works of scripture),3 but user access is limited to two parallel texts, and no analytic tools are provided. No similar resources exist for any secular works at all, yet.
This reflects the notorious ‘invisibility’ of translators and translations in general (Venuti, 2008). A key aim of our project is to make them visible.

Retranslations are successive translations of the ‘same’ source work, often somehow dependent on precursor (re)translations. The source works concerned are mostly unstable texts in their original language: what translators translate varies and changes. And so does how they do it. The gamut runs from word-for-word renderings to very free adaptations or rewritings with little obvious relation to the source. Relay translation (via a third language) introduces further variation. If translations are reprinted or otherwise re-used, they tend to be changed again. Venuti (2004) argues that retranslations (more than most translations) ‘create value’ in the target culture.4 A first translation of a foreign work creates awareness of it. If retranslations follow, the work becomes assimilated to the target culture. If retranslations multiply, each both reinforces the value and status of the work in the target culture, and extends the range of competing interpretations surrounding it. Retranslations therefore throw up questions going well beyond linguistic and cultural transfer, concerning ‘the values and institutions of the translating culture’, and how these are defended, challenged, or changed (Venuti, 2004, p. 106).

Within Translation Studies, ‘retranslation studies’ is underdeveloped, despite its fundamental importance for translation, linguistics, and communication, as well as comparative, transnational cultural studies. As Munday (2012) argues, retranslations are important resources, because no single utterance or text exists in isolation from alternative forms it might have taken. Any extant text is surrounded by a ‘penumbra’ of ‘unselected forms’ (Munday, 2012, p. 13, citing Grant, 2007, pp. 183–4); so any translation is surrounded by ‘shadow translations’ (Johannson, 2011, p.
3, citing Matthiessen, 2001, p. 83). Sets of translations by different translators (or the same translators at different moments) make visible at least some otherwise unselected forms. This offers scope for studying ‘the value orientations that underlie these selections’ (Munday, 2012, p. 13). Our project seeks to go even further: from the how and why of variation among translations, back to the varying capacity of the translated text to provoke variation.

The article is organized as follows: Section 2 reviews related work, including statistical studies in translation variation. Section 3 presents our software project, covering our Aligner, Corpus Overviews (including stylometric analysis), and our key innovation: an interface deploying ‘Eddy and Viv’ algorithms to explore translation variation. Section 4 presents findings of experiments using the software. Section 5 offers concluding comments.

2 Related Work

There has been little digital work on larger retranslations corpora, involving works of wide intrinsic interest, and none designed to facilitate access to multiple translations, and the translated work, together with algorithmic analyses. Jänicke et al. (2015) take in some ways a similar approach, but their ‘TRAViz’ interface offers a very different mode of text visualization, is monolingual (shows no translated text), and works best with more limited variation and shorter texts (see Section 3.3). Lapshinova-Koltunski (2013) describes a parallel multi-translation corpus designed to support computational linguistic analyses of differences between professional translations, student translations, Machine Translation (MT) outputs, and edited MT outputs. Shei and Pain (2002) proposed a similar parallel corpus, with an interface designed for translator training.
These projects only offer access to filtered segments of the text corpus, and do not envisage exploring variation among retranslations.

Altintas, Can, and Patton (2007) used two time-separated (c.1950, c.2000) collections of published translations of the same seven English, French, or Russian literary classics into Turkish, to quantify aspects of language change. This raises the question whether such translations ‘represent’ their language. Corpus-based Translation Studies (Baker, 1993; Kruger et al., 2011) has established that translated language differs from untranslated language. We also know from decades of work in Descriptive Translation Studies (Morini, 2014; Toury, 2012) that retranslations vary for complex genre-, market-, subculture-specific and institutional factors, and individual psychosocial factors, involving the translators and others with a hand in the work (commissioners, editors), and their uses of resources including source versions and prior (re)translations. There is no consensus on defining such factors and their interrelations. The conclusion of a manual analysis of eight English versions of Zola’s novel Nana is typically vague:

(. . .) specific conditions (. . .) explain the similarities and differences (. . .). The conditions comprise broad social forces: changing ideologies and changing linguistic, literary, and translational norms; as well as more specific situational conditions: the particular context of production and the translator’s preferences, idiosyncrasies, and choices. (Brownlie, 2006, p. 167)

The basic lesson is that translation is a humanities subject. Translators are writers.
As Baker warns:

Identifying linguistic habits and stylistic patterns is not an end in itself: it is only worthwhile if it tells us something about the cultural and ideological positioning of the translator, or of translators in general, or about the cognitive processes and mechanisms that contribute to shaping our translational behaviour. We need then to think of the potential motivation for the stylistic patterns that might emerge from this type of study. (Baker, 2000, p. 258)

Her comment is cited by Li et al. (2011, p. 157), in their computationally assisted study of two English translations of Xueqin Cao’s Hongloumeng.5 They conclude:

corpus-assisted translation research can go beyond proving the obvious or the already known as long as meta- or para-texts are available for the analysis. The extent and depth of such analysis of course depends on the amount of information available in the form of meta- or other texts. (Li et al., 2011, p. 164)

Genuine understanding of cultural materials requires knowledge and critical understanding of many other materials, to assess how multi-scale human factors shape texts and the effects they have (had) in their cultural world.

Non-digital studies in retranslation underline the importance of such shaping factors. Deane-Cox (2014) and O’Driscoll (2011) both recently investigated large sets of English retranslations of 19th-century French novels. They detail at length the historical contexts of each retranslation, its production and reception, and analyse short samples linguistically or stylistically. Deane-Cox’s overall argument disproves the ‘Retranslation Hypothesis’ put forward by Antoine Berman (1990, p. 1). Berman argued that over time, successive retranslations should tend to translate the source text more accurately.
In fact, as we will see, this may hold for a first few retranslations, but when they multiply, the hypothesis no longer holds. This is partly because retranslators who come late in a series must be more inventive, to distinguish their work from that of precursors and rivals. The desire for distinction is a great motivator (Mathijssen, 2007; Hanna, 2016). Critical translation studies pays close attention to such specific contextual factors, viewing each translation as an act of intervention in a particular moment in a particular place in the geographical and social world, and a trace of a translator’s (and associated agents’) both conscious and unconscious choices (Munday, 2012, p. 20). As Munday argues, translation is essentially an evaluative act. Translators’ decisions are based on evaluations of the source text, of the implicit values of its author and intended audience, and of the expectations and values of the intended audience of the translation.

2.1 Statistical Studies

Statistical studies of differences between translations confirm this perspective, and also rain on the MT parade. They show that variation is greatest both in the most semantically significant units of a text, and in the units which are most expressive of values and affect. Babych and Hartley (2004) measured the stability of alternative translations at word and phrase level in English versions of 100 French news stories by two professional translators. They found a strong statistical correlation between instability and the scores of linguistic items in the source text for salience (tf.idf score) or significance (S-score; see Babych et al., 2003). The more important an item is for a text’s meaning, the less translators tend to agree about translating it (though each one is consistent in using their selected terms).
Babych and Hartley deduce that ‘highly significant units typically do not have ready translation solutions and require some “artistic creativity” on the part of translators’, and that this necessary ‘freedom’ makes translation fundamentally ‘“non-computable” or “non-algorithmic”’ (Babych and Hartley, 2004, p. 835, citing Penrose, 1989). They conclude that there are:

fundamental limits on using data-driven approaches to MT, since the proper translation for the most important units in a text may not be present in the corpus of available translations. Discovering the necessary translation equivalent might involve a degree of inventiveness and genuine intelligence. (Babych and Hartley, 2004, p. 836)

Munday (2012, pp. 131–54) studied seventeen English translations of an extract from a story by Jorge Luis Borges: two published translations and fifteen commissioned from advanced trainee translators. Four in five lexical units varied. Invariance was associated with ‘simple, basic, experiential or denotational processes, participants and relations’ (p. 143). Variation mainly occurred in ‘lexical expression of attitude’, i.e. affect/emotion, judgment/ethics, appreciation, or evaluation (p. 24). Variation was greatest at ‘critical points’, where ‘attitude-rich’ words and phrases ‘carry the attitudinal burden of the text’ and communicate ‘the central axiological values of the protagonists, narrator or writer’ (p. 146): again, in effect, the semantically most significant items.

Translations vary most at points of greatest semantic and evaluative/attitudinal salience. MT has a long way to go, then. Its problems include identifying attitude, affect, or evaluation in a text to be translated.
In a chapter on MT and pragmatics, Farwell and Helmreich (2015) discuss lexical and syntactic differences in 125 Spanish newswire articles translated into English by two professional translators: 40% of units differed, and 41% of differences could be attributed to the translators’ different ‘assumptions about the world’ (rather than assumption-neutral paraphrasing, or error). One example is this headline:

Acumulación de víveres por anuncios sísmicos en Chile

Translation 1: Hoarding caused by earthquake predictions in Chile
Translation 2: Stockpiling of provisions because of predicted earthquakes in Chile
(Farwell and Helmreich, 2015, p. 171)

The translations make vastly different ideological, political, and evaluative assumptions. ‘Hoarding’ suggests a panicky, irrational population, responding to rumours of an unlikely event. ‘Stockpiling’ (by the population, or the civil authorities?) is a prudent response to credible (scientific experts’?) warnings. It is impossible, without ‘meta- or para-texts’, to disentangle whether the translators impute different values to the mind of the source text creator, or to its intended readers, or to the anticipated readers of the target text, and/or whether they express their own psychological and ideological values. ‘Acumulación’, here, has major evaluative implications which could not be predicted without area-specific political and economic expertise. Perhaps a multi-retranslation corpus could be used to discover which items provoke variation, as a proxy for such knowledge? If not, what would it discover?
3 Project Description

A multi-retranslation corpus will contain versions of various kinds: complete, fragmentary, edited, adapted versions; versions derived from (a version of) the original-language translated work, or from intermediaries in the translating language, and/or other languages; versions in various media; for various audiences (popular, scholarly, restricted); in mono-, bi-, or plurilingual formats; from various periods and places; produced and received under various economic, political, institutional, and cultural-linguistic conditions. An obvious lay question is: Which one is best? But the problem is already clear: By what criteria, or whose, do we judge? Models for assessing professional translations (House, 1997) are predicated on full and precise rendering of the source, but work less well with creative genres, where such ‘fidelity’ is often subordinated to effect in the target culture. Retranslations of poetry, plays, novels, religious, or philosophical works can be very successful (i.e. ‘good’, for many people) without being at all complete or accurate. A related question is: Why do most retranslations have brief lives (just one publication, or media or performance use), while others, backed by some institutional authority, become canonical, and have many editions, revisions, and re-uses, over generations? Does the answer lie in linguistic, textual qualities of the translation, measured in terms of its relation to the original work? Or in some qualities of it, measured in relation to alternative versions or other target culture corpora? Or does it lie solely in institutional factors?

Our project does not comprehensively address these questions. It grew out of a particular piece of translation criticism, and the intuition that digital tools could be developed to explore patterns in variation among multiple (re)translations, in themselves, in relation to target cultural contexts, and in relation to the translated work.
Before knowing any of the above-mentioned studies, Cheesman wanted to find ways to compare a large collection of German translations and adaptations of Shakespeare’s play, The Tragedy of Othello, The Moor of Venice (see corpus overviews in Section 3.2 below).6 His interest was as a researcher in German and comparative literature and culture. He had worked on a recent, controversial version of Othello (Cheesman, 2010), and wondered how it related to others. He manually examined over thirty translations (1766–2010) of a very small sample: a fourteen-word rhyming couplet, a ‘critical moment’ which is rich in affect, evaluation, and ambiguity (Cheesman, 2011).7 His study showed how differences among the translations traced a 250-year-long conversation about human issues in the work: gender, race, class, political power, interpersonal power, and ethics. Could digital tools help to explore such questions and communicate their interest to a wider public? The couplet he had selected was clearly more variously translated than most passages in the play. So he wondered if we could devise an algorithmic analysis which would identify all the most variously translated passages, to steer further research.

A proof-of-concept toolset (‘Translation Array Prototype’) was built, using as test data a corpus of thirty-eight hand-curated digital texts of German translations and adaptations of part of the play: Othello, Act 1, Scene 3. This is about 3,400 continuous words of the play’s 28,000, in English: 392 lines and 92 speeches (in Neill’s 2006 edition). The restricted sample size was due to restricted resources for curating transcriptions, and translation copyright limitations. Versions were procured from libraries, second-hand book-sellers, and theatre publishers (who distribute texts not available through the book trade).
Digital transcription stripped out original formatting and paratexts (prefaces, notes, etc.). The transcriptions were minimally annotated, marking up speech prefixes, speeches, and stage directions. The brief for the programmers (Flanagan and Thiel) was to build visual web interfaces enabling the user to: align a set of versions with a base text and so create a parallel multi-version corpus;8 obtain overviews of corpus metadata and aligned text data; navigate parallel text displays; apply an algorithmic analysis to explore the differing extent to which base text segments provoke variation among translations; customize this analysis and create various forms of data output to support cultural analyses.

3.1 Aligner

An electronic Shakespeare text was manually collated with a recent edition, to give us a base text inclusive of historic variants.9 Then we needed to align it segmentally with the versions. Existing open tools for working with text variants (e.g. Juxta collation software)10 lack the necessary functionality, as do existing computer-assisted translation tools. Perhaps such software could be adapted; at any rate, we built a web-based tool from scratch. The developer, Flanagan, explains its two main components:

Ebla: stores documents, configuration details, segment and alignment information, calculates variation statistics, and renders documents with segment/variation information.

Prism: provides a web-based interface for uploading, segmenting and aligning documents, then visualizing document relationships.

Areas of interest in a document are demarcated using segments, which also can be nested or overlapped. Each segment can have an arbitrary number of attributes. For a play these might be ‘type’ (with values such as ‘Speech’, ‘Stage Direction’), or ‘Speaker’ (with values such as ‘Othello’, ‘Desdemona’), and so on.
(Flanagan in: Cheesman et al., 2012)

Hand- or machine-made attributes such as ‘irony’, ‘variant from source x’, ‘crux’, ‘body part y’, ‘affect z’, ‘syllogism’, ‘trochee’, and ‘enjambement’ are equally possible. But all would require time-consuming tagging. In fact, we have worked only with ‘type: Speech’. Segment positions are stored as character offsets within documents, and texts can be edited without losing this information (transcription errors keep being discovered). Segmented documents are aligned in an interactive WYSIWYG tool, where an ‘auto-align’ function aligns all the next segments of a specified attribute. For Othello, every speech prefix, speech and ‘other’ string is automatically pre-defined as a segment of that type. Any string of typographic characters in a speech can be manually defined as a segment and aligned.

Thiel and colleagues at Studio Nand built visual interfaces on top of Prism, including parallel-text views tailor-made for dramatic texts (base text and any translation), and the ‘Eddy and Viv’ view discussed below (Section 4). Thiel (2014b) documents the design process. He also sketched a scalable, zoomable multi-parallel view of base text and all aligned versions, an overview model which remains to be developed as an interface for combined reading and analysis (Thiel, 2014a).11

3.2 Corpus overviews

Visual overviews of a corpus support distant readings of text and/or metadata features. We devised three. An online, interactive time-map of historical geography shows when and where versions were written and published (performances are a desideratum); it identifies basic genres (published books for readers, books for students, theatre texts), and provides bio-bibliographical information (Thiel, 2012). A stylometric diagram is discussed in Section 3.2.2 (Fig. 2). ‘Alignment maps’ depict the information created by segment alignment (Fig. 1).
Fig. 1 Alignment maps of thirty-five German Othello 1.3 (1766–2010)

3.2.1 Alignment maps

Alignment maps, developed by Thiel, are ‘barcode’-type maps which show how a translation’s constituent textual parts (here: speeches) align with a similar map of the base text. Figure 1 shows thirty-five such maps, in chronological sequence. Each left-hand block represents the English base text of Othello 1.3, the right-hand block represents a German text, and the connecting lines represent alignments in the system. Within each block, horizontal bars represent speeches (in sequence top to bottom) and thickness represents their length, measured in words; Othello’s longest speech in the scene (and the play) is highlighted. Small but significant differences in overall length can be noticed: translations tend to be longer than the translated texts, so it is interesting to spot versions which are complete yet more concise, such as Gundolf (1909). We can see which versions, in which passages, make cuts, reduce, expand, transpose, or add material which could not be aligned with the base text. In the centre of the figure, the German translation (Felsenstein and Stueber, 1964) of the Italian libretto (by Boito) of Verdi’s opera Otello (1887) is a good example of omission, addition, and transposition. Omissions and additions are also evident in the recent stage adaptations on the bottom line. Zimmer (2007), like Boito, assigns Othello’s long speech to multiple speakers. In our online system, these maps serve as navigational tools alongside the texts in Thiel’s parallel-text views.
Each bar representing a speech is also tagged with the relevant speech prefix, so any character’s part can be highlighted and examined. Aligned segments are rapidly, smoothly synched in these interfaces, assisting exploratory bilingual reading.

3.2.2 Stylometric network diagram

Figure 2 depicts a stylometric analysis of relative Most Frequent Word frequencies in 7,000-word chunks of forty German versions of Othello, carried out by Rybicki using the Stylo script and the Gephi visualization tool.12 The network diagram shows (1) relations of general similarity between versions, represented by relative proximity (clustering), and (2) similarities in particular sets of frequency counts, represented by connecting lines; their thickness or strength represents degree of similarity. These lines (edges) can indicate intertextual relations: dependency of some kind, including potential plagiarism. Directionality can be inferred from date labels on nodes. For example, the version by Bodenstedt (1867) (near top centre) was revised in the strongly connected version by Rüdiger (1983). This confirms data on his title page. Other results, as we will see, are more surprising: spurs to further research.

The x/y axes are not meaningful. The analysis involves hundreds of counts using differing parameters: the diagram is a design solution to the problem of representing high-dimensional data in a two-dimensional plane. Removing or adding even one version produces a different layout and can re-arrange clusters. Moreover, the analysis process is so complex that we cannot specify which text features lie behind the results. Broadly, though, the diagram can be read historically, right to left: a highly formal poetic theatre language gives way to increasingly informal, colloquial style.

Nine versions are revisions, editions or rewritings of the canonical translation by Baudissin (originally 1832, in the famed ‘Schlegel-Tieck’ Shakespeare edition; see: Sayer, 2015).
Most are quite strongly connected and closely clustered, but the apparent stylometric variety is a surprise. The long, weak line connecting the cluster to the heavily revised stage adaptation by Engel (1939) (upper left) is to be expected, but the length and weakness of the connection with Wolff’s (1926) published edition (lower right) is more of a surprise. His title page indicated a modestly revised canonical text, but stylometry suggests something more radical is going on.13

Above all, this analysis reveals the salience of historical period. Distinct clusters are formed by all the early C19 versions (mid-right), arguably all the late C19 versions (top), most of the late C20 versions (lower left), and all the C21 versions (far left). The C21 versions are all idiosyncratic adaptations (cf. Fig. 1, bottom line). It is surprising to see how similar they appear, in stylometric terms, relative to the rest of the corpus. And what do the strong links among them indicate? Mutual influence, plagiarism, common external influence? What about the lines leading from Gundolf (1909) (low centre) across to Swaczynna (1972), to Laube (1978), to Günther (1995)? Günther is the most celebrated living German Shakespeare translator: do these lines trace his debts to less famous precursors? Period outliers are also interesting. Zeynek (?-1948) appears to be writing a C19 style in the 1940s. The unknown Schwarz (1941) is curiously close to the famous Fried (1972). Rothe (1956) (extreme bottom left) is writing in a late C20 style in the 1950s. This throws interesting new light on the notorious ‘Rothe case’ of the Weimar Republic and Nazi years: he was victimized for his ‘liberal’, ‘modern’ approach to translation (Von Ledebur, 2002).

Genre is salient, too.
A very distinct cluster, bottom right, includes all versions designed for study and written in prose (rather than verse). This includes our two earliest versions (1766 and 1779) and two published 200 years later (1976, 1985). Strongly interconnected, weakly connected with any other versions, this cluster demonstrates the flaw in the approach of Altintas et al. (2007). Differences in the use of German represented by distances across the rest of the graph cannot be due to any general historical changes in the language. They reflect changes in the specific ways German is used by translators of Shakespeare for the stage, and/or for publications aimed at people who want to read his work for pleasure.

Fig. 2 Stylometric analysis of forty German Othellos. Node label key: Translator_Date. Prefix: Baud = version of Baudissin (1832). Suffixes: _Pr = prose study edition. No suffix = other book. _T = theatre text (no book trade distribution). _X = theatre text, not performed (only version by a woman).

3.3 The ‘Eddy and Viv’ interface

Overviews are invaluable, but the core of our system is a machine for examining differences at small scale. The machine implements an algorithm we called ‘Eddy’,[14] to measure variation in a corpus of translations of small text segments. Eddy’s findings are then aggregated and projected onto the base text segments by the algorithm ‘Viv’ (‘variation in variation’). In an interface built by Thiel, on the basis of Flanagan’s work, users view the scrollable base text (Fig. 3: left column) and can select any previously defined and aligned segment: this calls up the translations of it, in a scrollable list (Fig. 3: right columns).
The list can be displayed in various sequences (transition between sequences is a pleasingly smooth visual effect) by selecting from a menu: order by date; by the translator’s surname; by length; or (as shown in Fig. 3) by Eddy’s algorithmic analysis of relative distinctiveness. Eddy metrics are displayed with the translations, and also represented by a yellow horizontal bar which is longer, the higher the relative value.

We defined ‘segment’, by default, as a ‘natural’ chunk of dramatic text: an entire speech, in semi-automated alignment. Manual definition of segments (any string within a speech) is possible, but defining and aligning such segments in forty versions is time-consuming. In future work we intend to use the more standard definition: segment = sentence (not that this would simplify alignment, since translation and source sentence divisions frequently do not match). Eddy compares the wording of each segment version with a corpus word list: here the corpus is the set of aligned segment versions. No stop words are excluded; no stemming, lemmatization, or parsing is performed. Flanagan explains how the default Eddy algorithm works:

Each word in the corpus word list [the set of unique words for all versions combined] is considered as representing an axis in N-dimensional space, where N is the length of the corpus word list. For each version, a point is plotted within this space whose co-ordinates are given by the word frequencies in the version word list for that version. (Words not used in that version have a frequency of zero.) The position of a notional ‘average’ translation is established by finding the centroid of that set of points. An initial ‘Eddy’ variation value for each version is calculated by measuring the Euclidean distance between the point for that version and the centroid. (Flanagan in Cheesman et al., 2012–13)

This default Eddy algorithm is based on the vector space model for information retrieval.
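As a rough illustration of the procedure Flanagan describes, the following Python sketch builds a word-count vector for each version, finds the centroid, and returns each version's Euclidean distance from it. It is a minimal sketch, not the project's implementation: the function name `eddy_scores`, the lowercased whitespace tokenization, and the three toy 'versions' are assumptions made for the example.

```python
import math
from collections import Counter


def eddy_scores(versions):
    """Euclidean distance of each version's word-count vector from the centroid."""
    # Assumption: lowercase, whitespace-separated tokens; no stop words removed,
    # no stemming or lemmatization (in line with the default described above).
    counts = [Counter(v.lower().split()) for v in versions]
    # Corpus word list: every unique word across all versions (one axis each).
    vocab = sorted(set().union(*counts))
    # Centroid: mean count of each word over all versions.
    centroid = [sum(c[w] for c in counts) / len(counts) for w in vocab]
    # Distance from each version's point to the centroid.
    return [
        math.sqrt(sum((c[w] - z) ** 2 for w, z in zip(vocab, centroid)))
        for c in counts
    ]


# Hypothetical toy segment versions (not taken from the corpus).
toy_versions = ["alpha beta", "alpha beta", "alpha gamma"]
for text, score in zip(toy_versions, eddy_scores(toy_versions)):
    print(f"{score:.3f}  {text}")
```

In the toy input the two identical versions lie closest to the centroid, and the version containing a word that no other version uses lies furthest from it, which is the behaviour the description leads one to expect.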
Given a set $S$ of versions $\{a, b, c, \ldots\}$, where each version is a set of tokens $\{t_1, t_2, t_3, \ldots, t_n\}$, we create a set $U$ of unique tokens from all versions in $S$ (i.e. a corpus word list). For each version in $S$ we construct vectors of attributes $A, B, C, \ldots$, where each attribute is the occurrence count within that version of the corresponding token in $U$, that is:

$$A_i = \sum_{j=1}^{|a|} \left[ a_j = U_i \right]$$

(the bracketed comparison counts 1 when token $a_j$ is the $i$-th token of $U$, and 0 otherwise). We construct a further vector $Z$ to represent the centroid of $A, B, C, \ldots$ such that

$$Z_i = \frac{A_i + B_i + C_i + \cdots}{|S|}$$

Then, for a version $a$, the default Eddy value is calculated as:

$$\mathrm{Eddy} = \sqrt{\sum_{i=1}^{|U|} |Z_i - A_i|^2}$$

This default Eddy formula is used in the experiments reported below, coupled with a formula for Viv as the average (arithmetic mean) of Eddy values. Other versions of the formulae can be selected by users,[15] e.g. an alternative Eddy value based on angular distance, calculated as:

$$\mathrm{Eddy} = \frac{2\cos^{-1}\left(\dfrac{A \cdot Z}{|A|\,|Z|}\right)}{\pi}$$

Work remains to be done on testing the different algorithms, including the necessary normalization for variations in segment length.[16]

Fig. 3 Eddy and Viv interface (Colour online)

Essentially, Eddy assigns lower metrics to wordings which are closer to the notional average, and higher metrics to more distant ones. So, Eddy ranks versions on a cline from low to high distinctiveness, or originality, or unpredictability. It sorts common-or-garden translations from interestingly different ones.

Viv shows where translators most and least disagree, by aggregating Eddy values for versions of the base text segment, and projecting the result onto the base text segment. Viv metrics for segments are displayed if the text is brushed, and relative values are shown by a colour annotation (floor and ceiling can be adjusted). As shown in Fig.
3, the base text is annotated with a colour underlay of varying tone. Lighter tone indicates relatively low Viv (average Eddy) for translations of that segment. Darker tone indicates higher Viv. Shakespeare’s text can now be read by the light of translations (Cheesman, 2015).

Sometimes it is obvious why translators disagree more or less. In Fig. 3, Roderigo’s one-word speech ‘Iago -’ has a white underlay: every version is the same. The Duke’s couplet beginning ‘If virtue no delighted beauty lack. . .’ (the subject of Cheesman’s initial studies) has the darkest underlay. As we knew, translators (and editors, performers, and critics) interpret this couplet in widely varying ways. In the screenshot, the Duke’s couplet has been selected by the user: part of the list of versions can be seen on the right. MTs back into English are provided, not that they are always helpful.

Unlike the TRAViz system (Jänicke et al., 2015), ours does not represent differences between versions in terms of edit distances, and translation choices in terms of dehistoricized decision pathways. Our system preserves key cultural information (historical sequence). It can better represent very large sets of highly divergent versions. The TRAViz view of two lines from our Othello corpus (Jänicke et al., 2015, Figure 17) is a bewilderingly complex graph. With highly divergent versions of longer translation texts, TRAViz output is scarcely readable. Crucially there is no representation of the translated base text. The Eddy and Viv interface is (as yet) less adaptable to other tasks, but better suited to curiosity-driven cross-language exploration.[17]

4. Experiments with Eddy and Viv

4.1 Eddy and ‘Virtue? A fig!’

To illustrate Eddy’s working, Table 1 shows Eddy results, in simplified rank terms (‘high’, ‘low’, or unmarked intermediate), for thirty-two chronologically listed versions of a manually aligned segment with a very high Viv value: ‘Virtue? A fig!’ (Othello 1.3.315).
An exclamation is always, in Munday’s terms, ‘attitude-rich’, burdened with affect; this one is a ‘critical point’ for several reasons. ‘Virtue’ is a very significant term in the play, and crucially ambiguous: in Shakespeare’s time it meant not only ‘moral excellence’ but also ‘essential nature’, or ‘life force’, and ‘manliness’.[18] The speaker here is Iago, responding to Roderigo, who has just declared that he cannot help loving the heroine, Desdemona: ‘. . . it is not in my virtue to amend it’. Roderigo means: not in my nature, my power over myself, my male strength. But Iago’s response implies the moral meaning, too. Then, the phrase ‘A fig!’ is gross sexual innuendo. ‘Fig’ meant vagina. The expression derives from Spanish and refers to an obscene hand gesture: intense affect (see Neill, 2006, p. 235). (The expression ‘I don’t give/care a fig!’ was once commonplace, and often used euphemistically for ‘fuck’, a word Shakespeare never uses.)

Table 1 ‘Virtue? A fig!’ in thirty-two German translations (1766–2012)

| Numbers | ‘Eddy’ rank | Length rank | Translations | Back-translations | Sources | Intertexts |
|---|---|---|---|---|---|---|
| 1 | | | Tugend? Pfifferling. | Virtue? [Not worth a] chanterelle. | Wieland 1766 (S) | (P) |
| 2 | | H | Tugend!—Den Henker auch! | Virtue!—[To Hell with] the executioner too! | Eschenburg (ed. Eckert) 1779 (S) | |
| 3 | L | L | Tugend? Possen! | Virtue? Buffoonery! | Schiller and Voss 1805 | (P) Cf. #4, #11 |
| 4 | L | | Tugend? Narrenspossen! | Virtue? Fools’ buffoonery! | Benda 1826/Ortlepp 1839 | |
| 5 | L | | Tugend! Abgeschmackt! | Virtue! Vulgar! | * Baudissin [Schlegel-Tieck] 1832 | (P) |
| 6 | | | Tugend? Zum Henker! | Virtue? To the executioner! [= be damned!] | Bodenstedt 1867 | |
| 7 | | | Tugend? Leeres Gefasel! | Virtue? Mindless drivel! | Jordan 1867 | |
| 8 | | | Tugend? Wischiwaschi! | Virtue? Drivel! | Gildemeister 1871 | |
| 9 | L | L | Tugend! Aeh! | Virtue! Ugh! | Vischer 1887 | |
| 10 | | | Tugend! Pfeif drauf! | Virtue! Whistle on it! [= Don’t give a damn for it] | Gundolf 1909 | |
| 11 | L | L | Tugend! Possen! | Virtue! Buffoonery! | Baudissin (ed. Wolff) 1926 | |
| 12 | | L | Tugend! Dummheit! | Virtue! Stupidity! | Engel 1939 (T) | |
| 13 | | | Energie? Ein Schmarren! | Energy? Nonsense! [dialectal: S German] | Schwarz 1941 (T) | Cf. #23 |
| 14 | | L | Tugend! Ach was! | Virtue! Oh come on! | Baudissin (ed. Brunner) 1947 (S) | |
| 15 | H | H | Nicht die Kraft! Zum lachen! | Not the strength! Laughable! | Zeynek ?-1948 (T) | |
| 16 | L | | ‘Tugend’? Quatsch! | ‘Virtue’? Nonsense! | Flatter 1952 | |
| 17 | | | Tugend? Weiße Mäuse! | Virtue? White mice! | Rothe 1956 | |
| 18 | | | Macht? Dummes Zeug! | Power? Stuff and nonsense! | Schaller 1959 | |
| 19 | | | Tugend? Keine Feige wert! | Virtue? Not worth a fig! | Schröder 1962 | |
| 20 | | | Tugend? fick drauf | Virtue? fuck it | Swaczynna 1972 (T) | |
| 21 | H | H | In deiner Macht? Ach was! | In your power? Oh come on! | * Fried 1972 | (P) Cf. #23, #27 ... |
| 22 | | | Tugend! Ein Quark! | Virtue! Quark! [soft cheese/nonsense] | Lauterbach 1973 (T) | Cf. #27 |
| 23 | | | Macht? Schmarren! | Power? Nonsense! [dialectal: S German] | * Engler 1976 (S) | |
| 24 | H | H | Nicht in deiner Macht? Son Quatsch! | Not in your power? What nonsense! | Laube 1978 (T) | Cf. #27 |
| 25 | | | Tugend! Ein Dreck! | Virtue! Filth! [Crap] | Rüdiger 1983 (T) | |
| 26 | L | L | Tugend? Quatsch | Virtue? Nonsense | * Bolte and Hamblock 1985 (S) | |
| 27 | H | H | Nicht in deiner Macht? Quark! | Not in your power? Quark! | * Günther 1995 | (P) Cf. #31, #32. |
| 28 | H | | Da kannst du lange beten | You can pray a long time [= Not until the cows come home] | Motschach 1992 (T) | |
| 29 | | L | Affenkram | Ape-rubbish [= Crap!] | Buhss 1996 (T) | |
| 30 | H | H | Charakter? Am Arsch der Charakter! | Character? Character my arse! | * Zaimoglu and Senkel 2003 | |
| 31 | H | H | Nicht in deiner Macht? Quatsch! | Not in your power? Nonsense! | Leonard 2010 (T) | |
| 32 | | | Nicht in deiner Macht! | Not in your power! | * Steckel 2012 | |

H and L indicate highest (H) and lowest (L) seven Eddy value rankings and length rankings. Alternative translations to ‘Tugend’ = ‘virtue’ are underscored. Sources: * = now in print. (S) = study text. (T) = no book trade distribution (theatre text). Intertexts: (P) = prestigious, influential.

The lowest and highest seven Eddy rankings are indicated. Eddy’s lowest-scoring translation is ‘Tugend? Quatsch’ (#16, #26). ‘Tugend’ is the modern dictionary translation of (moral) ‘virtue’. ‘Quatsch’ is a harmless expression of disagreement: a bowdlerized translation (bowdlerization is clear in most versions here).[19] The Eddy score is low because most translations (until 1985) use ‘Tugend’ and several also use ‘Quatsch’. Eddy’s highest score is for ‘Charakter? Am Arsch der Charakter!’ (#30).
This is Zaimoglu’s controversial adaptation of 2003, with which Cheesman’s work on Othello began (2010). No other translation uses those words, including the preposition ‘am’ and article ‘der’. ‘Charakter’ accurately translates the main sense of Shakespeare’s ‘virtue’ here, and ‘Arsch’ fairly renders ‘A fig!’ This is among the philologically informed translations of ‘virtue’ (as ‘energy’, ‘strength’, ‘power’), a series which begins with Schwarz (1941) (#13). It is also among the syntactically expansive translations, with colloquial speech rhythms, which begin with Zeynek (?-1948) (#15).[20] Both series become predominant following the prestigious Fried (1972) (#21).

Reading versions both historically and with Eddy, in our interface, makes for a powerful tool. Here the historical distribution of Eddy rankings confirms what we already know about changes in Shakespeare translation. The lowest mostly appear up to 1926. The highest mostly appear since 1972 (recall Figure 2: lower left quadrant). Ranking by length in typographical characters is not often useful, but with such a short segment its results are interesting, and similar to Eddy’s. Most of the shortest are up to 1947, and most of the longest since 1972: that shift towards more expansive, colloquial translations, again.

Similar historical Eddy results are found for many segments in our corpus. An ‘Eddy History’ graph, plotting versions’ average Eddy on a timeline, can be generated: it shows Eddy average rising in this corpus since about 1850. This may be a peculiarity of German Shakespeare. It may be an artefact of the method. But it is conceivable that, with further work, the period of an unidentified translation might be predicted by examining its Eddy metrics.

Eddy and Viv results for any selected segments, based on the full corpus or a selected subset of versions, can be retrieved and explored in several forms of chart, table, and data export.
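The rankings discussed for ‘Virtue? A fig!’ can be approximated with a short script. The sketch below applies the default Eddy calculation (Euclidean distance from the centroid of word-count vectors) to the thirty-two wordings listed in Table 1, takes Viv as the mean of those values, and also ranks the wordings by length in characters. The tokenizer (lowercased runs of word characters, punctuation dropped) and all function and variable names are assumptions for illustration; the project's own tokenization and any length normalization are not specified here, so intermediate ranks need not match those shown in the interface.

```python
import math
import re
from collections import Counter

# Wordings of 'Virtue? A fig!' keyed by their Table 1 number.
VERSIONS = {
    1: "Tugend? Pfifferling.", 2: "Tugend!—Den Henker auch!",
    3: "Tugend? Possen!", 4: "Tugend? Narrenspossen!",
    5: "Tugend! Abgeschmackt!", 6: "Tugend? Zum Henker!",
    7: "Tugend? Leeres Gefasel!", 8: "Tugend? Wischiwaschi!",
    9: "Tugend! Aeh!", 10: "Tugend! Pfeif drauf!",
    11: "Tugend! Possen!", 12: "Tugend! Dummheit!",
    13: "Energie? Ein Schmarren!", 14: "Tugend! Ach was!",
    15: "Nicht die Kraft! Zum lachen!", 16: "‘Tugend’? Quatsch!",
    17: "Tugend? Weiße Mäuse!", 18: "Macht? Dummes Zeug!",
    19: "Tugend? Keine Feige wert!", 20: "Tugend? fick drauf",
    21: "In deiner Macht? Ach was!", 22: "Tugend! Ein Quark!",
    23: "Macht? Schmarren!", 24: "Nicht in deiner Macht? Son Quatsch!",
    25: "Tugend! Ein Dreck!", 26: "Tugend? Quatsch",
    27: "Nicht in deiner Macht? Quark!", 28: "Da kannst du lange beten",
    29: "Affenkram", 30: "Charakter? Am Arsch der Charakter!",
    31: "Nicht in deiner Macht? Quatsch!", 32: "Nicht in deiner Macht!",
}


def tokenize(text):
    # Assumption: lowercase word tokens, punctuation ignored.
    return re.findall(r"\w+", text.lower())


def eddy_by_version(versions):
    """Map each version number to its Euclidean distance from the centroid."""
    counts = {k: Counter(tokenize(v)) for k, v in versions.items()}
    vocab = sorted(set().union(*counts.values()))
    n = len(counts)
    centroid = {w: sum(c[w] for c in counts.values()) / n for w in vocab}
    return {
        k: math.sqrt(sum((c[w] - centroid[w]) ** 2 for w in vocab))
        for k, c in counts.items()
    }


eddy = eddy_by_version(VERSIONS)
by_eddy = sorted(eddy, key=eddy.get)                      # low to high Eddy
by_length = sorted(VERSIONS, key=lambda k: len(VERSIONS[k]))  # short to long
viv = sum(eddy.values()) / len(eddy)                      # Viv as mean Eddy

print("lowest Eddy :", by_eddy[:2])
print("highest Eddy:", by_eddy[-7:])
print("shortest    :", by_length[0], " longest:", by_length[-1])
print(f"Viv (mean Eddy) for this segment: {viv:.3f}")
```

Under these assumptions the extremes agree with those reported in the text: #30 receives the highest value, #16 and #26 share the lowest, and the one-word #29 is the shortest wording without falling among Eddy's seven lowest or highest.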
The interactive ‘Eddy Variation’ chart, for example, facilitates comparisons between one translator’s work and that of any set of others (e.g. her precursors and rivals). It plots Eddy results for selected versions against segment position in the text; any version’s graph can be displayed or not (simplifying focus on the translation of interest); when a node is brushed, the relevant bilingual segment text is displayed.

Eddy’s weaknesses are evident in Table 1, too. It fails to highlight the only one-word translation (#29), or the one giving ‘fig’ for ‘fig’ (#19), or the one with the German equivalent of ‘fuck’ (#20), expressing the obscenity which remains concealed from most German readers and audiences. We still need to sort ordinary translations from extraordinary and innovative ones in more sophisticated ways.

Eddy also fails to throw light directly on genetic and other intertextual relations. Some are indicated in the ‘Intertexts’ column in Table 1: the probable influence of some prestigious retranslations is apparent in several cases, as is the possible influence of some obscure ones. Such dependency relations require different methods of analysis and representation. Stylometric analysis (Section 3.2.2) provides pointers. More advanced methods must also encompass negative influence, or significant non-imitation. Table 1 shows (and this result is typical too) that the canonical version (#5), the most often read and performed German Shakespeare text from 1832 until today, is ‘not’ copied or even closely varied. That is no doubt because of risk to a retranslator’s reputation. Retranslators must differentiate their work from what the public and the specialists know (Hanna, 2016).

The tool we built is a prototype. Eddy is admittedly imperfect.
But its real virtue lies in the power it gives to Viv, enabling us to investigate to what extent base text features and properties might correlate with differences among translations. Even that is only a start, as Flanagan points out:

Ebla can be used to calculate different kinds of variation statistics for base text segments based on aligned corpus content. These can potentially be aggregated-up for more coarse-grained use. The results can be navigated and explored using the visualization functionality in Prism. However, translation variation is just one of the corpus properties that could be investigated. Once aligned, the data could be analysed in many other ways. (Flanagan in: Cheesman et al., 2012–13)

4.2 ‘Viv’ in Venice

An initial Viv analysis of Othello 1.3, involving all the ninety-two natural ‘speech’ segments, was reported (Cheesman, 2015).[21] It found that the ‘highest’ Viv-value segments tended to be (1) near the start of the scene, (2) spoken by the Duke of Venice, who dominates that scene, but appears in no other, and (3) rhyming couplets (rather than blank verse or prose). There are twelve rhyming couplets in the scene; two are speech segments; both were in the top ten of ninety-two Viv results.

No association was found between Viv value and perceptible attitudinal intensity, or any linguistic features. We did find some high-Viv segments associated with specific cross-cultural translation challenges. Highest Viv was a speech by Iago with the phrase ‘silly gentleman’, which provokes many different paraphrases. But some lower-Viv segments present similar difficulties, on the face of it. There was no clear correlation. Still, four hypotheses emerged for further research.
Hypothesis 1: Based on finding (3) above (rhyming couplets having high Viv value): retranslators diverge more when they have additional poetic-formal constraints.[22]

Hypothesis 2: Based on finding (1) above: retranslators diverge more at the start of a text or major chunk of text (i.e. at the start of a major task).

Hypothesis 3: Based on finding (2) above: retranslators diverge more in translating a very salient, local text feature in a structural chunk (in this scene: the part of the Duke) and less in translating global text features (e.g. here: Othello, Desdemona, Iago).

Hypothesis 4 relates to ‘low’ Viv findings. It was somewhat disappointing to find that speeches by the hero Othello and the heroine Desdemona, including passages which generate much editorial and critical discussion, had moderate, low, or very low Viv scores. Famous passages where Othello tells his life story and how he fell in love with Desdemona, or where Desdemona defies her father and insists on going to war with Othello, surely present key challenges for retranslators.

Perhaps passages which have been much discussed by commentators and editors pose less of a cognitive and interpretive challenge, as the options are clearly established.[23] This hypothesis could be investigated by marking up passages with a metric based on the extent of associated annotation in editions and/or frequency of citation in other corpora. For now, we have speculated that the hero’s and heroine’s speeches in this particular scene do exhibit common attitudinal features, which are dramatic rather than linguistic. In the low-Viv segments, the characters can be seen to be taking care to express themselves particularly clearly; even if very emotional, they are controlling that emotion to control a dramatic situation. Perhaps translators respond to this ‘low affect’ by writing less differently? But it is difficult to quantify such a text feature and so check Viv results against any ‘ground truth’.
There is another possible explanation: in the most ‘canonical’ parts of the text (here: the hero’s and heroine’s parts), retranslators perhaps tread a careful line between differentiating their work and limiting their divergence from prestigious precursors.[24] Such ‘prestige cringe’ would relate to the above-mentioned negative influence, or non-imitation of the most prestigious translations (Section 4.1). Precursors act, paradoxically, as both negative and positive constraints on retranslators.

Hypothesis 4: in the most canonical constituent parts of a work, Viv is low, as retranslators tend to combine willed distinctiveness with caution, limiting innovation.

In the initial analysis, the groups of speeches assigned highest and lowest Viv values had suspiciously similar lengths. Clearly the normalization of Eddy calculations for segment length leaves something to be desired. The next and latest analysis focused on segments of similar length to investigate our hypotheses.

Table 2 ‘Viv values’ in two liners in Othello 1.3 generated by twenty German versions

4.3 ‘Viv’ in two liners

Table 2 shows the grammatically complete two-line verse passages in Othello 1.3, plus prose passages of equivalent length,[25] in Viv value rank order. A sub-corpus of twenty translations was selected for better comparability.[26] The text assigned to each major character part here is reasonably representative of their overall part in the scene, counted in lines: Brabantio (sample eighteen lines [nine couplets]/total sixty-one lines) 0.3, Desdemona (10/31) 0.32, Duke (22/67) 0.33, Iago (14/65) 0.21, Othello (20/108) 0.19.
Hypothesis 1 seems to be confirmed, though more work needs to be done to prove it conclusively: high Viv value correlates with poetic-formal constraint. In the column ‘Form’ in Table 2, blank verse is the default. Unsurprisingly, rhyming couplets appear mostly in the top half of the table, including five of the top ten items. Translators enjoy responding to the formal challenge of rhyming couplets in self-differentiating ways; and they must so respond, or else they very obviously plagiarize, because these items are rare in the text and highly noticeable, for audiences or readers.

Hypothesis 2 is not confirmed: the column ‘Running order’ gives no sign that translators differentiate their work more at the start of the scene, as they embark on a new chunk of the task. That could have been interesting for psycholinguistic and cognitive studies of translation (Halverson, 2008).

Hypothesis 3 seems to be confirmed, but we need much more evidence to be sure we have discovered a general pattern. The column ‘Speaker’ shows that the Duke’s segments are more variously translated than those of other speakers. Even if we exclude rhyming couplets, the Duke is over-represented in the upper part of the table. Brabantio and Iago also have some very high-Viv lines, but their segments are distributed evenly up and down the table. Not so with the Duke, who is the salient, local text feature in this scene and no other.

Hypothesis 4 also seems provisionally confirmed. Othello is strikingly low-Viv, mostly. Desdemona tends to be low- to mid-Viv. Translations of their parts differ ‘less’ than other parts, at this scale. Why? We do not know. It could be ‘prestige cringe’ (Section 4.2). But it could also be specific to this text.
Othello in particular refuses ‘affect’ in this scene, as he does throughout the first half of the play: he is in command of everything, including his emotions. He echoes a much discussed line just spoken by Desdemona (‘I saw Othello’s visage in his mind’, 1.3.250) when he says to the Duke and assembled Senators that he wants her to go to war with him, but:

I therefore beg it not,
To please the palate of my appetite,
Nor to comply with heat—the young affects
In me defunct—and proper satisfaction.
But to be free and bounteous to her mind: (. . .)
(Othello 1.3.258–63)

This is one of the play’s cruxes: passages which editors deem corrupt and variously resolve (here, ‘me’ is often changed to ‘my’, ‘defunct’ to ‘distinct’, and the punctuation revised).[27] Translators also resolve this passage variously, depending in part on which edition(s) they work with; but, as measured by Viv, not very variously, compared with other passages. Can it be that textual ‘affect’ is relatively less, because that is the kind of character, the mind, the ‘virtue’ Othello is projecting?

5 Concluding Comments

Findings which only confirmed what was already known would be truly disappointing (though we do need some such confirmation, to have any faith in digital tools). Digital literary studies should provoke thought. A classic example is Moretti’s discovery of a rhythm of 25–30 years in the emergence and disappearance of C19 novelistic genres, which he uneasily ascribed to a cycle of biological-sociocultural ‘generations’:

I close on a note of perplexity: faute de mieux, some kind of generational mechanism seems the best way to account for the regularity of the novelistic cycle—but ‘generation’ is itself a very questionable concept. Clearly, we must do better. (Moretti, 2003, p. 82)

So too with ‘Translation Arrays’ and ‘Version Variation Visualization’: we must do better.
We wanted to demonstrate that this sort of approach opens up interesting possibilities for future research.[28] Of course one big difference between Moretti’s work and ours so far is one of scale. His team works with tens or hundreds of thousands of texts and metadata items. We are working with a few dozen versions of one play, in one target language, because that is what we have got,[29] and only a fragment of the play, because we chose to make the texts publicly accessible, which entails copyright restrictions (and some expense). Our approach requires time-consuming text curation (correction of digital surrogates against page images),[30] permission acquisition, and manual segmentation and alignment processes (more sophisticated approaches including machine learning will speed these up).[31]

Moretti experimentally ‘operationalizes’ pre-digital critical concepts such as ‘character-space’ or ‘tragic collision’ (Moretti, 2013), by measuring quantities in texts: digital proxies or analogues. Eddy and Viv, on the other hand, are measuring relational corpus properties which have no obvious pre-digital analogue. What could they be proxies for? Eddy makes visible certain kinds of resemblance and difference, certain sequences, patterns of influence and distinctiveness. Critically understanding these still depends on understanding ‘para- and meta-texts’ (Li, Zhang and Liu, 2011). Viv’s contribution is even less certain: we won’t know whether its results correspond to anything ‘real’ about translated texts’ qualities, or those of translations, or of translators, until we have studied many more cases.

Eddy and Viv analysis, as implemented, is crude. We can imagine training next-generation Eddy on human-evaluated variant translations. We can envisage experiments with lemmatization, stopword exclusion, parsing, morphosyntactical tagging,[32] diverse automated segment definitions, text analytics, and plugging in other corpora for richer analyses.
When does a translator’s use of language mimic a pre-existing style, when is it innovative, in what way? We can map texts to Wordnets, historical dictionaries and thesauri. We can model topics, analyse sentiments. We can explore consistency and coherence within translations, usage of less common words, word-classes, word-sets, grammatical, rhetorical, poetic, prosodic, metrical, metaphorical features, and so on. We can generate intertextual and phylogenetic trees. We can perhaps adjust Viv for historical sequence, and weight for the complex effects of influence, imitation, and intentional non-imitation. Given multi-lingual parallel corpora, we can project a cross-cultural Viv. The more sophisticated the analysis, the greater its scope, the greater the cost of text preparation and annotation, and the greater the challenge in creating visual interfaces which offer value to non-programmers. For text resources on a scale which might justify such investment, we must next look to scripture. Then we will need experts in God’s domain, as well.

Funding

This work was supported by Swansea University (Research Incentive Fund and Bridging the Gaps), and the main phase of software development was funded by a 6-month Research Development Grant in 2012 under the Digital Transformations theme of the Arts and Humanities Research Council (UK), reference AH/J012483/1.

References

Algee-Hewitt, M., Allison, S., Gemma, M., Heuser, R., Moretti, F., and Walser, H. (2016). Canon/archive: large-scale dynamics in the literary field. Literary Lab Pamphlet, Vol. 11. http://litlab.stanford.edu/LiteraryLabPamphlet11.pdf (accessed 16 January 2016).
Altintas, K., Can, F., and Patton, J. M. (2007). Language change quantification using time-separated parallel translations. Literary and Linguistic Computing, 22(4): 375–93.
Babych, B. and Hartley, A. (2004). Modelling legitimate translation variation for automatic evaluation of MT quality. Proceedings of LREC 2004, pp. 833–6. http://www.lrec-conf.org/proceedings/lrec2004/pdf/707.pdf (accessed 16 January 2016).
Babych, B., Hartley, A., and Atwell, E. (2003). Statistical modelling of MT output corpora for information extraction. In Archer, D., Rayson, P., Wilson, A., and McEnery, T. (eds), Proceedings of the Corpus Linguistics 2003 Conference, Lancaster University, 28–31 March 2003, pp. 62–70. http://ucrel.lancs.ac.uk/publications/CL2003/papers/babych.pdf (accessed 16 January 2016).
Baker, M. (1993). Corpus linguistics and translation studies: implications and applications. In Baker, M., Francis, G., and Tognini-Bonelli, E. (eds), Text and Technology: In Honour of John Sinclair. Amsterdam and Philadelphia: John Benjamins, pp. 233–250.
Baker, M. (2000). Towards a methodology for investigating the style of a literary translator. Target, 12(2): 241–66.
Baudissin, W. (1832). Shakspeares dramatische Werke. Vol. 8. Berlin: Reimer.
Berman, A. (1990). La Retraduction comme espace de traduction. Palimpsestes, 13: 1–7.
Bodenstedt, F. (1867). Othello, der Mohr von Venedig. Leipzig: Brockhaus.
Bolte, H. and Hamblock, D. (1985). Othello: Englisch-Deutsch. Stuttgart: Philipp Reclam.
Brownlie, S. (2006). Narrative theory and retranslation theory. Across Languages and Cultures, 7(2): 145–70.
Buhss, W. (1996). William Shakespeare Othello, Venedigs Neger. Berlin: Henschel Schauspiel Theaterverlag.
Cheesman, T. (2010). Shakespeare and Othello in Filthy Hell: Zaimoglu and Senkel’s Politico-Religious Tradaptation. Forum for Modern Language Studies, 46(2): 207–20.
Cheesman, T. (2011). Thirty Times ‘More Fair Than Black’: Othello Re-translation as Political Re-statement. Angermion, 4: 1–52.
Cheesman, T. (2015). Reading originals by the light of translations. Shakespeare Survey, 68: 87–98.
Cheesman, T. (2016). Othello 1.3: ‘Far More Fair Than Black’.
In Smith, B. R. (ed.), The Cambridge Guide to the Worlds of Shakespeare, vol. 2. The World’s Shakespeare, 1660-Present. Cambridge: Cambridge University Press, pp. 1156–61. Cheesman, T. and the VVV Project Team (2012). Translation Sorting with Eddy and Viv. http://www. scribd.com/doc/101114673/Eddy-and-Viv (accessed 16 January 2016). T. Cheesman et al. 18 of 22 Digital Scholarship in the Humanities, 2016 by guest on O ctober 10, 2016 http://dsh.oxfordjournals.org/ D ow nloaded from Cheesman, T., Flanagan, K., and Thiel, S. (2012–13). Translation Array Prototype 1: Project Overview. http://www.delightedbeauty.org/vvvclosed/Home/ Project (accessed 16 January 2016). Cheesman, T., Flanagan, K., Thiel, S., and Rybicki, J. (2016), Five maps of translations of Shakespeare. In Wiggin, B. and Macleod, C. (eds), Un/Translatables: New Maps for Germanic Literatures. Evanston, IL: Northwestern University Press, forthcoming. Deane-Cox, S. (2014). Retranslation: Literature and Reinterpretation. London: Bloomsbury. Eder, M., Kestemont, M., and Rybicki, J. (2016). Stylometry with R: a package for computational text analysis. R Journal, 16(1), forthcoming. Engel, E. (1939). William Shakespeare Othello. Berlin: Felix Bloch Erben. Engler, B. (1976). Othello: Englisch-deutsche Studienausgabe. Munich: Franke. Farwell, D. and Helmreich, S. (2015). Pragmatics-based machine translation. In Chan, S. (ed.), The Routledge Encyclopedia of Translation Technology. Abingdon/New York: Routledge, pp. 167–85. Felsenstein, W. and Stueber, C. (1964). Giuseppe Verdi: Othello. Milan and Frankfurt am Main: Ricordi. Flatter, R. (1952). Othello der Mohr von Venedig. Sonderabdruck für Bühnenzwecke. Munich: Theater- Verlag Desch. Fried, E. (1972). Hamlet/Othello. Berlin: Wagenbach. Geng, Z., Laramee, R. S., Cheesman, T., Ehrmann, E., and Berry, D. M. (2011). Visualizing translation vari- ation: Shakespeare’s Othello. Advances in visual com- puting. Lecture Notes in Computer Science, 6938: 653–63. 
Geng, Z., Laramee, R. S., Flanagan, K., Thiel, S., and Cheesman, T. (2015). ShakerVis: visual analysis of segment variation of German translations of Shakespeare’s Othello. Information Visualization, 14(4): 273–88.
Grant, C. B. (2007). Uncertainty and Communication: New Theoretical Investigations. Basingstoke: Palgrave Macmillan.
Gundolf, F. (1909). Shakespeare in deutscher Sprache. Vol. 1. Berlin: Bondi, 1920.
Günther, F. (1995). William Shakespeare. Othello. Zweisprachige Ausgabe. Munich: Deutscher Taschenbuch Verlag.
Hadas, E. (2015). Word2Dream: A Reader’s Companion. http://eranhadas.com/word2dream (accessed 16 January 2016).
Halverson, S. L. (2008). Psycholinguistic and cognitive approaches. In Baker, M. and Saldanha, G. (eds), Routledge Encyclopedia of Translation Studies. London and New York: Routledge, pp. 211–16.
Hanna, S. (2016). Bourdieu in Translation Studies: The Socio-cultural Dynamics of Shakespeare Translation in Egypt. London and New York: Routledge.
Hope, J. and Witmore, M. (2014). The language of Macbeth. In Thompson, A. (ed.), Macbeth: The State of Play. London: Bloomsbury (Arden), pp. 183–208.
House, J. (1997). Translation Quality Assessment: A Model Revisited. Tübingen: Narr.
Hutchings, T. (2015). Studying apps: research approaches to the digital Bible. In Cheruvallil-Contractor, S. and Shakkour, S. (eds), Digital Methodologies in the Sociology of Religion. London: Bloomsbury, chapter 9.
Jänicke, S., Geßner, A., Franzini, G., Terras, M., Mahony, S., and Scheuermann, G. (2015). TRAViz: a visualization for variant graphs. Digital Scholarship in the Humanities, 30(Suppl 1): i83–99. http://dsh.oxfordjournals.org/content/30/suppl_1/i83 (accessed 16 January 2016).
Johannson, S. (2011). Between Scylla and Charybdis: on individual variation in translation. Languages in Contrast, 11(1): 3–19.
Kruger, A., Wallmach, K., and Munday, J. (2011). Corpus-based Translation Studies: Research and Applications. London: Continuum.
Lapshinova-Koltunski, E. (2013). VARTRA: a comparable corpus for analysis of translation variation. Proceedings of the 6th Workshop on Building and Using Comparable Corpora, Sofia, Bulgaria, August 8, 2013, pp. 77–86. http://aclweb.org/anthology/W13-2510 (accessed 16 January 2016).
Laube, H. (1978). William Shakespeare Othello Der Mohr von Venedig übersetzt und bearbeitet. Frankfurt am Main: Verlag der Autoren.
Lauterbach, E. S. (1973). William Shakespeare, Othello, der Mohr von Venedig. Aus dem Englischen von E. S. Lauterbach unter Mitarbeit von Benita Gleisberg. Berlin: Henschel Schauspiel Theaterverlag.
Li, D., Zhang, C., and Liu, K. (2011). Translation style and ideology: a corpus-assisted analysis of two English translations of Hongloumeng. Literary and Linguistic Computing, 26(2): 153–66.
Long, L. (2007). Revealing commentary through comparative textual analysis: the realm of Bible translations. Palimpsestes, 20: 35–57.
Matthiessen, C.M.I.M. (2001). The environments of translation. In Steiner, E. and Yallop, C. (eds), Beyond Content: Exploring Translation and Multilingual Text Production. Berlin and New York: Mouton de Gruyter, pp. 41–124.
Mathijssen, J. (2007). The Breach and the Observance: Theatre Retranslation as a Strategy of Artistic Differentiation, with Special Reference to Retranslations of Shakespeare’s Hamlet (1777-2001). PhD dissertation, Utrecht University.
Moretti, F. (2003). Graphs, maps, trees: abstract models for literary history 1. New Left Review, 24: 67–93.
Moretti, F. (2013). ‘Operationalizing’: or, the Function of Measurement in Modern Literary Theory. Literary Lab Pamphlet, 6. http://litlab.stanford.edu/LiteraryLabPamphlet6.pdf (accessed 16 January 2016).
Morini, M. (2014). The Pragmatic Translator: An Integral Theory of Translation. London: Bloomsbury.
Motschach, H. (1992). William Shakespeare Othello. Munich: Drei Masken.
Mueller, M. (2003–14). About Metadata and the Query Potential of the Digital Surrogate. http://wordhoard.northwestern.edu/userman/index.html (accessed 16 January 2016).
Munday, J. (2012). Evaluation in Translation. Critical Points of Translation Decision-making. Abingdon; New York: Routledge.
Neill, M. (ed.) (2006). The Oxford Shakespeare: Othello. Oxford: Oxford University Press.
O’Driscoll, K. (2011). Retranslation through the Centuries: Jules Verne in English. Oxford: Peter Lang.
Paloposki, O. and Koskinen, K. (2010). Reprocessing texts: the fine line between retranslating and revising. Across Languages and Cultures, 11(1): 29–48.
Penrose, R. (1989). The Emperor’s New Mind: Concerning Computers, Minds and the Laws of Physics. Oxford: Oxford University Press.
Roos, A. (2015). Making a Clean Breast of English Passover Haggadah Translations: Data Visualization of Bowdlerization in Haggadah Translations of Ezekiel 16:7. Unpublished chapter of PhD dissertation, University of Amsterdam.
Rothe, H. (1956). Der Elisabethanische Shakespeare. Vol. 4. Baden-Baden: Holle.
Rüdiger, R. (1983). William Shakespeare Othello, der Mohr von Venedig Tragödie. In Anlehnung an die Übersetzung von Friedrich von Bodenstedt Nach dem Original neu übersetzt. Berlin: Felix Bloch Erben.
Rybicki, J. (2012). The great mystery of the (almost) invisible translator: stylometry in translation. In Oakley, M. and Ji, M. (eds), Quantitative Methods in Corpus-Based Translation Studies. Amsterdam: John Benjamins, pp. 231–48.
Rybicki, J. and Heydel, M. (2013). The stylistics and stylometry of collaborative translation: Woolf’s ‘Night and Day’ in Polish. Literary and Linguistic Computing, 28(4): 708–17.
Sayer, J. (2015). Wolf Graf Baudissin (1789-1878): Life and Legacy. Münster: LIT.
Schaller, R. (1959). Shakespeares Werke. Vol. 4. Berlin: Rütten & Loening.
Schröder, R. A. (1962). Shakespeare/deutsch. Berlin, Frankfurt am Main: Suhrkamp.
Schwarz, H. (1941).
Othello, der Maure von Venedig. Typescript. Shakespeare-Bibliothek München.
Shei, C-C. and Pain, H. (2002). Computer-assisted teaching of translation methods. Literary and Linguistic Computing, 17(3): 323–43.
Steckel, F. (2012). Die Tragödie von Othello, dem Mohren von Venedig. Frankfurt am Main: Verlag der Autoren.
Swaczynna, W. (1972). Die Tragödie von Othello, dem Mohren von Venedig. Cologne: Jussenhoven & Fischer.
Thiel, S. (2010). Understanding Shakespeare: towards a visual form for dramatic texts and language. http://www.understanding-shakespeare.com (accessed 16 January 2016).
Thiel, S. (2012). Othello map. http://othellomap.nand.io/ (accessed 16 January 2016).
Thiel, S. (2014a). Visualizing translation variation. http://transvis.s3-website-eu-west-1.amazonaws.com/# or www.tinyurl.com/transvis2014 (accessed 16 January 2016).
Thiel, S. (2014b). Visualizing Translation Variation: Designing Tools for Literary Scholars in Translation Studies and Linguistics. Masters Dissertation. Bauhaus University Weimar.
Thiel, S. (2015). Macbeth Loglikelihoods. http://macbeth.s3-website-eu-west-1.amazonaws.com/ or www.tinyurl.com/macbthe (accessed 16 January 2016).
Toury, G. (2012). Descriptive Translation Studies – and Beyond. Revised edition. Amsterdam and Philadelphia: John Benjamins.
Venuti, L. (2004). Retranslations: The Creation of Value. Bucknell Review, 47(1): 25–38.
Venuti, L. (2008). The Translator’s Invisibility: A History of Translation. Revised edition. London and New York: Routledge.
Von Ledebur, R. (2002). Der Mythos vom deutschen Shakespeare: die Deutsche Shakespeare-Gesellschaft zwischen Politik und Wissenschaft 1918-1945. Cologne and Weimar: Böhlau.
Wachsmann, M. (2005). William Shakespeare, Die Tragödie von Othello, dem Mohr von Venedig. Berlin: Gustav Kiepenheuer Bühnenvertriebs-Gmbh.
Wang, Q. and Li, D. (2012). Looking for translators’ fingerprints: a corpus-based study on Chinese translations of Ulysses. Literary and Linguistic Computing, 27(1): 81–93.
Wolff, M. J. (1926). Shakespeares Werke übertragen nach Schlegel-Tieck. Vol. 14. Berlin: Volksverband der Bücherfreunde, Wegweiser-Verlag.
Zaimoglu, F. and Senkel, G. (2003). William Shakespeare Othello. Bearbeitung. Münster: Monsenstein und Vannerdat.
Zeynek, T. (?-1948). Shakespeare: Othello Der Mohr von Venedig. Munich: Ahn und Simrock Bühnen und Musikverlag.
Zimmer, H. (2007). Othello steht im Sturm: Jugendstück frei nach Shakespeare. Weinheim: Deutscher Theaterverlag Weinheim.
Notes
1. ‘Version Variation Visualization: Translation Array Prototype 1’ at http://www.delightedbeauty.org/vvvclosed. Further project links: www.tinyurl.com/vvvex. Alternative prototype tools were also built: see Geng et al., 2011, 2015. See further: Cheesman, 2015, 2016, and Cheesman et al., 2016.
2. The existence of multilingual (re)translations can indicate both popularity and prestige, as in publishers’ blurbs for novels ‘translated into X languages’. For the Stanford Literary Lab, translations index popularity (Algee-Hewitt et al., 2016, p. 3). But ‘multiple’ retranslations often also mean prestige: some are included in institutional curricula, reviewed in ‘high-brow’ media, etc.
3. For example, 1,096 versions of the Bible in 781 languages at www.bible.com or approx. 170 versions of the Quran in forty-seven languages at http://al-quran.info. See Long (2007) and Hutchings (2015).
4. Venuti (2004) focuses on retranslations which deliberately challenge pre-existing translations. Our corpus is not so restricted.
5. See also Wang and Li (2012): digitally supported analysis of two Chinese translations of James Joyce’s Ulysses.
6. For details of the forty plus German texts used, see www.delightedbeauty.org (‘German’ page).
7. ‘If virtue no delighted beauty lack, / Your son-in-law is far more fair than black’ (Othello 1.3.287–8). Multilingual translations of this are crowd-sourced by Cheesman at: www.delightedbeauty.org.
8. This remains less easy than we would wish. Roos is working with Eran Hadas on a more user-friendly corpus-creation, segmenting and aligning interface, in the course of a study of English translations of the Hebrew Haggadah from the C18 to now, also using tools such as TRAViz (Jänicke et al., 2015) and Word2Dream (Hadas, 2015). See Roos, 2015, and http://www.tinyurl.com/JewishDH.
9. Cheesman collated MIT’s ‘Moby’ Shakespeare (http://shakespeare.mit.edu) with Neill’s edition (2006) for added dialogue and modern spellings. We chose to sample Othello 1.3 partly because the English text is stable between editions, at the level of speeches and speech prefixes, if not at the level of wording (except at 1.3.275–6; see Neill, 2006, p. 232); also for its variety of major character parts.
10. http://www.juxtasoftware.org. Juxta helps map phylogeny, with the aim of (re)constructing an original or an authoritative edition. We cannot study retranslations with any such aim. There is no right translation. There may be a canonical translation, but users feel free to revise it, because it is ‘just’ a translation.
11. The potential value of this interface to support explorations of text-analytic features is illustrated by the ‘Macbthe’ interface (Thiel, 2015): users explore a zoomable map of ‘Macbeth’ with a log likelihood lemma table, following the impetus of Hope and Witmore (2014). See also Thiel’s (2010) earlier work.
12. See: Eder et al. (2016) and stylometric translations analyses by Rybicki (2012) and Rybicki and Heydel (2013).
13. On the ‘fine line between retranslation and revision’ see: Paloposki and Koskinen, 2010. There is no research on Wolff, or indeed on most of the translators here.
14.
Cheesman named Eddy after (1) a formula he primitively devised as ‘ P D’, adapting tf.idf formulae (see: Cheesman and the VVV Project Team, 2012, p. 3), (2) his brother Eddy, and (3) the idea that retranslations are metaphorical ‘eddies’ in cultural historical flows.
15. Formulae available: A: Euclidean distance; B: Cheesman’s original, primitive formula; C: Viv as standard deviation of Eddy; D: Dice’s coefficient; E: angular distance.
16. ‘A normalisation needs to be applied to compensate for the effect of text length, [so] we calculated variation for a large number of base text segments of varying lengths, then plotted average [Euclidean] Eddy value against segment length. We found a logarithmic relationship between the two, and arrived at a normalisation function that gives an acceptably consistent average Eddy value regardless of text length’ (Flanagan in Cheesman et al., 2012–13). Eddy formula E (angular distance) appears to address the length normalization problem to some extent.
17. Stephen Ramsay commented on the ‘graceful and illuminating’ interface that ‘prompts various kinds of “noticing” and encourages an essentially playful and exploratory approach to the “data”’ (personal correspondence, 26 May 2014).
18. Neill glosses ‘virtue’ as ‘moral excellence’, ‘manly strength and courage’, and ‘inherent nature’ at 1.3.287; ‘power, strength of character’ at 1.3.315 (Neill, 2006, pp. 233 and 235; see there also for ‘fig’).
19. Roos (2015) uses Eddy and Viv to explore bowdlerization in English Haggadah texts.
20. Zeynek died in 1948; his translations are undated.
21. Stylometry and common sense recommended narrowing the corpus to give less ‘noisy’ results. I excluded prose study versions, adaptations with extensive omissions, contractions, expansions and additions, C18 and C19 versions, including all versions of Baudissin (1832), leaving fifteen versions: Gundolf (1909), Schwarz (1941), Zeynek (?-1948), Flatter (1952), Rothe (1956), Schaller (1959), Schröder (1962), Fried (1972), Swaczynna (1972), Laube (1978), Rüdiger (1983), Motschach (1992), Günther (1995), Buhss (1996), Wachsmann (2005).
22. The norm in German Shakespeare translation is that formal variation in the original (prose, blank verse, rhymed verse, or another metrical scheme) should be replicated or analogously marked. Roos (2015) reports similar findings for the Haggadah: rhyming verse sections have higher Viv, if translators use rhyme.
23. We thank a DSH referee for pointing out this possibility.
24. Roos (2015) similarly finds lower Viv value in Bible quotations (the most canonical segments) in Haggadah translations.
25. Based on the two-line verse segments found manually, the length range was set at 60–100 characters. Iago’s lengthy prose speeches include more examples than were segmented and aligned.
26. Baudissin (five versions, 1855–2000) was added to the corpus previously used, to recognize this translation’s enduring relevance.
27. See: Neill, 2006, p. 231. The MIT text (from an 1860s edition) is quoted, but with Neill’s line-numbering.
28. We also envisage training applications. An interface enabling trainee translators and trainers to compare versions would have great practical value, as an adjunct to a computer-assisted translation system and/or an assessment and feedback system.
29. Shakespeare retranslations are found at scattered sites. Larger, curated corpora are accessible in Czech and Russian: c.400 aligned texts (twenty-two versions of Hamlet) at http://www.phil.muni.cz/kapradi; c.200 texts (twelve versions of Hamlet) at http://rus-shake.ru/translations.
30. The term ‘surrogate’ is taken from Mueller (2003–14). Ideally our system would include page images.
31. Roos is working on this with Eran Hadas.
32. Difficulties include in-text variants (e.g. in critical editions, or translators’, directors’, and actors’ copies) and orthographic variations (archaic and variously modernized forms; ad hoc forms fitting metrical rules; other non-standard forms). Rather than standardize texts to facilitate comparisons, the machine should learn to recognize underlying equivalences.
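Illustrative sketch for notes 15 and 16. The distance measures named in note 15 (Euclidean distance, Dice's coefficient, angular distance) and the idea of Viv as the standard deviation of Eddy values can be made concrete in a few lines of Python. This is not the project's implementation. It assumes, purely for illustration, that each version of a segment is reduced to a bag of lower-cased, whitespace-separated tokens and that a version's Eddy value is its distance from the centroid (mean word-count vector) of all versions of that segment. All function names (`bag`, `centroid`, `eddy_values`, `viv`) are hypothetical, and the length normalisation described in note 16 is not reproduced.

```python
import math
from collections import Counter
from statistics import pstdev


def bag(text):
    """Word-count vector for one version of a segment.
    Assumption: naive lower-casing and whitespace tokenisation."""
    return Counter(text.lower().split())


def euclidean(a, b):
    """Euclidean distance between two sparse word-count vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys))


def angular(a, b):
    """Angle in radians between two vectors. It depends on direction,
    not magnitude, so it is less sensitive to segment length."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    if norm == 0:
        return 0.0
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def dice_distance(a, b):
    """1 minus Dice's coefficient, computed on the sets of word types."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return 1 - 2 * len(sa & sb) / (len(sa) + len(sb))


def centroid(vectors):
    """Mean word-count vector over all versions of a segment."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {k: c / len(vectors) for k, c in total.items()}


def eddy_values(versions, distance=euclidean):
    """Assumed Eddy: each version's distance from the centroid."""
    vectors = [bag(v) for v in versions]
    centre = centroid(vectors)
    return [distance(v, centre) for v in vectors]


def viv(versions, distance=euclidean):
    """Viv as the (population) standard deviation of Eddy values,
    in the spirit of formula C in note 15."""
    return pstdev(eddy_values(versions, distance))
```

Calling `eddy_values` on a list of version strings for one aligned segment returns one value per version, and `viv` summarises their spread. Passing `distance=angular` swaps in the angular measure without changing anything else, which is the design reason for taking the distance function as a parameter.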